=> fil reg; s gggggactggaagggctaattcactcccaa/sqsn
COPYRIGHT (C) 1993 American Chemical Society (ACS)
EXCLUDE SEARCH OF COMPLEMENTARY STRAND Y/(N)?:.
L1 35 GGGGGACTGGAAGGGCTAATTCACTCCCAA/SQSN



=> fil ca; s l1

FILE 'CA' ENTERED AT 14:42:04 ON 26 APR 93 
USE IS SUBJECT TO THE TERMS OF YOUR CUSTOMER AGREEMENT 
COPYRIGHT (C) 1993 AMERICAN CHEMICAL SOCIETY (ACS) 

FILE COVERS 1967 - 13 Apr 93 (930413/ED) VOL 118 ISS 16. 
For OFFLINE Prints or Displays, use the ABS or ALL formats to obtain 
abstract graphic structures. The AB format DOES NOT display structure 
diagrams. 



L2 9 L1

=> d 1-9 .beverly; sel hit l2 1-9 rn

L2 ANSWER 1 OF 9 COPYRIGHT 1993 ACS 
AN CA117(21) :206366s 
TI Molecular clones of HIV-1 strains MN-ST1 and BA-L and preparation of 

vaccines with antigenic proteins of these strains 
SO PCT Int. Appl., 55 pp. 

AU Reitz, Marvin S., Jr.; Franchini, Genoveffa; Markham, Phillip D.;
Gallo, Robert C.; Lori, Franco C.; Popovic, Mikulas; Gartner,
Suzanne

AI WO 91-US7611 17 Oct 1991 
PI WO 9206990 A1 30 Apr 1992
PY 1992 

AB HIV-1 strain MN-ST1 cDNA and a HindIII fragment of strain BA-L cDNA
are cloned and sequenced. Plasmids for expression of infectious 
viruses or env protein were prepd. Restriction maps of MN-ST1 
prophage cDNA and of the cDNA fragment from unintegrated BA-L DNA 
are presented. 

L2 ANSWER 2 OF 9 COPYRIGHT 1993 ACS 
AN CA116(6) :46279q 

TI Non-infectious HIV-1 particles and their use as vaccines 
SO PCT Int. Appl., 59 pp. 

AU Young, Richard A.; Baltimore, David; Aldovini, Anna; Trono, Didier; 
Feinberg, Mark B. 

AI WO 90-US5932 16 Oct 1990 
PI WO 9105860 A1 2 May 1991
PY 1991 

AB Noninfectious HIV-1 particles are produced using plasmids which 

encode HIV-1 mutants which are defective in viral packaging. These 
particles may be used as vaccines. Plasmids encoding HIV-1 with a 
deletion in the .vphi. site and/or substitution mutations in the 
metal-binding motifs of the gag gene were prepd. and the constructs 
were introduced into COS-1 cells. HIV-1 particles were produced but 
the particles were not infectious (as detd. by failure to infect H9 
T leukemia cell line) . 






L2 ANSWER 3 OF 9 COPYRIGHT 1993 ACS
AN CA115(25) :2726^ri ™ 
TI Molecular clones of HIV-1 and their uses 

SO U. S. Pat. Appl., 61 pp. Avail. NTIS Order No. PAT-APPL-6-599 491. 

AU Reitz, Marvin 

AI US 91-599491 31 Jan 1991 

PI US 599491 A0 1 Aug 1991

PY 1991 

AB The cDNA sequences representing the complete genomes of HIV-1
strains MN-PH1 and MN-ST1 are presented, as is the cDNA for the env
gene of a third HIV-1 strain, BA-L. The cDNAs can be used to produce
anti-HIV-1 vaccines and for diagnosis of HIV-1 infection (no data).
Expression plasmids for the env gene proteins of the strains were 
prepd. A eukaryotic expression plasmid contg. the entire MN-ST1 cDNA 
was prepd. for use in prodn. of the virus. 

L2 ANSWER 4 OF 9 COPYRIGHT 1993 ACS 
AN CA114(17) :162208y 

TI Production of a nonfunctional nef protein in human immunodeficiency 

virus type 1-infected CEM cells 
SO J. Gen. Virol., 71(10), 2273-81 

AU Laurent, Anne G.; Hovanessian, Ara G.; Riviere, Yves; Krust, 

Bernard; Regnault, Armelle; Montagnier, Luc; Findeli, Annie; Kieny, 
Marie Paule; Guy, Bruno 

PY 1990 

AB The nef gene product of the human immunodeficiency virus (HIV) is 
suggested to be a neg. factor involved in down-regulating viral
expression by a mechanism in which the correct conformation of the 
nef protein is essential. The nef protein expressed by vaccinia 
virus recombinants is phosphorylated by protein kinase C. The 
present study investigated the synthesis of the nef protein and its 
state of phosphorylation during HIV-1 infection of a T4 cell line 
(CEM cells). Max. synthesis of viral proteins occurred 3 days after 
infection, when more than 90% of cells were producing viral 
proteins. The synthesis of the nef protein was detected in parallel 
with the env and gag proteins. As expected, the nef protein was 
myristylated but not phosphorylated and its half-life was less than 
1 h. By the use of the polymerase chain reaction technique, the nef 
gene of this HIV-1 stock was isolated and sequenced. Two significant 
mutations were obsd. Firstly threonine, at amino acid no. 15, the 
site of phosphorylation by protein kinase C, was mutated into an 
alanine, and secondly aspartic acid of the tetrapeptide WRFD, which 
is probably involved in GTP binding, was mutated into an asparagine. 
The mutated nef gene was expressed in a vaccinia virus system, in 
which it was not phosphorylated and its half-life was dramatically
reduced compared to the wild-type nef gene product. Furthermore, 
down-regulation of CD4 cell surface expression was no longer 
affected by the mutated nef gene. These results emphasize that 
phosphorylation of the nef protein provides an efficient test to 
monitor its biol. activity. 

L2 ANSWER 5 OF 9 COPYRIGHT 1993 ACS 
AN CA111(19) :168198e 

TI Biological and molecular characterization of human immunodeficiency 
virus (HIV-1BR) from the brain of a patient with progressive 
dementia 

SO Virology, 168(1), 79-89 

AU Anand, Rita; Thayer, Richard; Srinivasan, A.; Nayyar, S.; Gardner, 

Murray; Luciw, Paul; Dandekar, Satya 
PY 1989 

AB HIV-1BR was isolated from the autopsied brain tissue of a 57-yr-old 
man who died of progressive dementing illness. This virus was shown 



to be HIV-1 by hybridization to HIV-specific DNA probes. The
expression of viral proteins as tested by radioimmunopptn. assay
revealed the presence of HIV-1 specific proteins. HIV-1BR replicated
in cultures of CD4+ T-lymphoid cells and induced cytopathic effects
in these cells. HIV-1BR also replicated in monocytoid cell lines.
The genetic nature of this isolate was detd. by mol. cloning and
sequencing of the 3'-half of the genome. DNA sequence information
established that HIV-1BR is a unique HIV-1 isolate. A stretch of
.apprx.30 bases in the nef gene of HIV-1BR was found duplicated when
compared with the other sequenced HIV-1 genomes. The functional
significance of this duplication remains to be detd.

L2 ANSWER 6 OF 9 COPYRIGHT 1993 ACS 
AN CA108(1) :1299q 

TI Complete nucleotide sequences of functional clones of the AIDS virus 
SO AIDS Res. Hum. Retroviruses, 3(1), 57-69 

AU Ratner, Lee; Fisher, Amanda; Jagodzinski, Linda L. ; Mitsuya, 

Hiroaki; Liou, Ruey Shyan; Gallo, Robert C; Wong-Staal, Flossie 
PY 1987 

AB To examine the mechanism of lymphocytotoxicity induced by human
T-lymphotropic virus type III/lymphadenopathy assocd. virus
(HTLV-III/LAV), an in vitro model has been developed. Introduction
of an HTLV-III/LAV proviral clone, HXB2, into normal lymphocytes
results in the prodn. of virions and cell death. The complete
nucleotide sequence of the proviral form of HXB2 has now been detd.
Its structure is quite similar to that previously detd. for
HTLV-III/LAV clones whose biol. capacities had not previously been
demonstrated. The biol. function of 2 addnl. clones of HTLV-III/LAV,
BH10 and HXB3, are reported. Clone BH10 which lacks the 5' long
terminal repeat sequences (LTR) and a portion of the 3'LTR is
reconstituted by substituting the corresponding sequences of HXB2
and is capable of generating infectious cytopathic virions. Clone
HXB3, which has been partially sequenced, is also capable of
producing lymphocytopathic virus. Clone HXB3 differs from HXB2 in
its lack of a termination codon in 3'orf, demonstrating that 3'orf
plays no major role in virus replication or cytopathic activity.
These data provide the necessary background to allow the
identification of viral determinants of replication, cytopathic
activity, and antigenicity using these functional proviral clones.

L2 ANSWER 7 OF 9 COPYRIGHT 1993 ACS 
AN CA105(1) :1450v 

TI Three novel genes of human T-lymphotropic virus type III: immune 
reactivity of their products with sera from acquired immune 
deficiency syndrome patients 

SO Proc. Natl. Acad. Sci. U. S. A., 83(7), 2209-13 

AU Arya, Suresh K. ; Gallo, Robert C. 

PY 1986 

AB Human T-lymphotropic virus type III or lymphoadenopathy assocd.
virus (HTLV-III/LAV) is the cause of acquired immune deficiency
syndrome (AIDS). In addn. to the conventional retroviral genes
involved in virus replication, namely, gag, pol, and env genes, DNA
sequence anal. of HTLV-III genome predicted 2 addnl. open reading
frames, termed short open reading frame (sor) and 3' open reading
frame (3' orf). Further, functional anal. revealed another gene with
transactivating function, termed tat. These HTLV-III specific genes
were structurally identified and functionally characterized by cDNA
cloning. DNA sequence anal. of the clones shows that the tat and 3'
orf genes contain 3 exons and their transcription into functional
mRNA involves 2 splicing events and that the sor gene contains
.gtoreq.2 exons. In vitro transcription and translation of the
cloned spliced sequences show that the sor, tat, and 3' orf genes
code for polypeptides with apparent mobilities of 24-25 kilodaltons
(kDa), 14-15 kDa, and 26-28 kDa, resp. All polypeptides are immune
reactive and are immunogenic in the natural host. Thus, the 3 extra
open reading frames of HTLV-III, 2 of which are unique to HTLV-III,
are genes that function in vivo and code for 3 new and previously
unrecognized HTLV-III antigens with differential immunogenicity in
individuals with acquired immune deficiency syndrome and related
disorders.

L2 ANSWER 8 OF 9 COPYRIGHT 1993 ACS 
AN CA102(21) :179952m 

TI Nucleic acid structure and expression of the human 

AIDS/lymphadenopathy retrovirus 
SO Nature (London), 313(6002), 450-8 

AU Muesing, Mark A.; Smith, Douglas H.; Cabradilla, Cirilo D.; Benton,
Charles V.; Lasky, Laurence A.; Capon, Daniel J.
PY 1985 

AB The 9213-nucleotide structure of the acquired immune deficiency
syndrome (AIDS)/lymphadenopathy virus has been detd. from mol.
clones representing the integrated provirus and viral RNA. The
sequence reveals that the virus is highly polymorphic and lacks
significant nucleotide homol. with type C retroviruses characterized
previously. Together with an anal. of the 2 major viral subgenomic
RNAs, these studies establish the coding frames for the gag, pol and
env genes and predict the expression of a novel gene at the 3' end
of the genome unrelated to the X genes of human T-lymphotrophic
virus I and II.

L2 ANSWER 9 OF 9 COPYRIGHT 1993 ACS 
AN CA102(15) :126416h 

TI Nucleotide sequence of the AIDS virus, LAV 
SO Cell (Cambridge, Mass.), 40(1), 9-17 

AU Wain-Hobson, Simon; Sonigo, Pierre; Danos, Olivier; Cole, Stewart; 

Alizon, Marc 
PY 1985 

AB The complete 9193-nucleotide sequence of the probable causative
agent of acquired immune deficiency syndrome (AIDS),
lymphadenopathy-assocd. virus (LAV), was detd. The deduced genetic
structure is unique; it shows, in addn. to the retroviral gag, pol,
and env genes, 2 novel open reading frames which were designated Q
and F. Remarkably, Q is located between pol and env, and F is
half-encoded by the U3 element of the long terminal repeat. Thus,
LAV is distinct from the previously characterized family of human T
cell leukemia (lymphoma) viruses.



E1 THROUGH E14 ASSIGNED
=> fil reg; s e1-e14

FILE 'REGISTRY' ENTERED AT 14:43:18 ON 26 APR 93 

USE IS SUBJECT TO THE TERMS OF YOUR CUSTOMER AGREEMENT 

COPYRIGHT (C) 1993 American Chemical Society (ACS) 

STRUCTURE FILE UPDATES: 23 APR 93 HIGHEST RN 147199-92-6 
DICTIONARY FILE UPDATES: 25 APR 93 HIGHEST RN 147199-92-6 



1 137574-23-3/RN
1 102686-56-6/RN
1 111804-75-2/RN
1 111804-83-2/RN
1 123056-88-2/RN
1 133172-96-0/RN
1 138362-52-4/RN
1 138362-53-5/RN
1 138362-54-6/RN
1 138362-55-7/RN
1 138362-56-8/RN
1 95568-14-2/RN
1 96098-36-1/RN
1 96098-41-8/RN

L3 14 (137574-23-3/RN OR 102686-56-6/RN OR 111804-75-2/RN OR
111804-83-2/RN OR 123056-88-2/RN OR 133172-96-0/RN OR
138362-52-4/RN OR 138362-53-5/RN OR 138362-54-6/RN OR
138362-55-7/RN OR 138362-56-8/RN OR 95568-14-2/RN OR
96098-36-1/RN OR 96098-41-8/RN)

=> d 1-14 .bevreg; fil ca; e alizon, m/au 10 

L3 ANSWER 1 OF 14 COPYRIGHT 1993 ACS 
RN 138362-56-8 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus 1 clone 

pA14-15HXB) (9CI) (CA INDEX NAME) 
SQL 9609 

MF Unspecified 

CI MAN 

L3 ANSWER 2 OF 14 COPYRIGHT 1993 ACS 

RN 138362-55-7 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus 1 clone 

pA15HXB) (9CI) (CA INDEX NAME) 

SQL 9606 

MF Unspecified 

CI MAN 

L3 ANSWER 3 OF 14 COPYRIGHT 1993 ACS 

RN 138362-54-6 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus 1 clone 

pA4HXB) (9CI) (CA INDEX NAME) 
SQL 9606 

MF Unspecified 

CI MAN 

L3 ANSWER 4 OF 14 COPYRIGHT 1993 ACS 

RN 138362-53-5 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus 1 clone 

pA3HXB) (9CI) (CA INDEX NAME) 

SQL 9607 

MF Unspecified 

CI MAN 

L3 ANSWER 5 OF 14 COPYRIGHT 1993 ACS 

RN 138362-52-4 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus 1 clone 

bCA20-W13) (9CI) (CA INDEX NAME) 
SQL 9613 
MF Unspecified 
CI MAN 

L3 ANSWER 6 OF 14 COPYRIGHT 1993 ACS 
RN 137574-23-3 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus 1 clone
.lambda.BA-L1 gene env plus 5'- and 3'-flanking region fragment)
(9CI) (CA INDEX NAME)



SQL 3807 

MF Unspecified 

CI MAN 

L3 ANSWER 7 OF 14 COPYRIGHT 1993 ACS 

RN 133172-96-0 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus 1 gene nef) 

(9CI) (CA INDEX NAME) 
SQL 621 

MF Unspecified 

CI MAN 

L3 ANSWER 8 OF 14 COPYRIGHT 1993 ACS 
RN 123056-88-2 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus clone pATZ6 
gene nef) (9CI) (CA INDEX NAME) 

SQL 657 

MF Unspecified 

CI MAN 

L3 ANSWER 9 OF 14 COPYRIGHT 1993 ACS 
RN 111804-83-2 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus clone HXB2 
13-kilodalton protein gene) (9CI) (CA INDEX NAME) 

SQL 621 

MF Unspecified 

CI MAN 

L3 ANSWER 10 OF 14 COPYRIGHT 1993 ACS 

RN 111804-75-2 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus clone HXB2) 

(9CI) (CA INDEX NAME) 
SQL 9177 

MF Unspecified 

CI MAN 

L3 ANSWER 11 OF 14 COPYRIGHT 1993 ACS 
RN 102686-56-6 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus clone pSP-12 

27-kilodalton protein gene) (9CI) (CA INDEX NAME) 
OTHER NAMES: 

CN Deoxyribonucleic acid (human T-cell leukemia provirus type III clone
pSP-12 27-kilodalton protein gene)

SQL 642 

MF Unspecified 

CI MAN 

L3 ANSWER 12 OF 14 COPYRIGHT 1993 ACS 
RN 96098-41-8 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus clone H9pv.22 
protein E' gene) (9CI) (CA INDEX NAME) 

OTHER NAMES: 

CN Deoxyribonucleic acid (lymphadenopathy/AIDS provirus clone H9pv.22 
protein E' gene) 

SQL 621 

MF Unspecified 

CI MAN 

L3 ANSWER 13 OF 14 COPYRIGHT 1993 ACS 

RN 96098-36-1 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus clone 

H9pv.22) (9CI) (CA INDEX NAME) 




OTHER NAMES:

CN Deoxyribonucleic acid (lymphadenopathy/AIDS provirus clone H9pv.22)

SQL 9213 

MF Unspecified 

CI MAN 



L3 ANSWER 14 OF 14 COPYRIGHT 1993 ACS 
RN 95568-14-2 REGISTRY 

CN Deoxyribonucleic acid (human immunodeficiency provirus clone
.lambda.J19) (9CI) (CA INDEX NAME)
OTHER NAMES:
CN Deoxyribonucleic acid (lymphadenopathy-associated provirus clone
.lambda.J19)

SQL 9193 

MF Unspecified 

CI MAN 



FILE 'CA' ENTERED AT 14:43:48 ON 26 APR 93 

USE IS SUBJECT TO THE TERMS OF YOUR CUSTOMER AGREEMENT 

COPYRIGHT (C) 1993 AMERICAN CHEMICAL SOCIETY (ACS) 

FILE COVERS 1967 - 13 Apr 93 (930413/ED) VOL 118 ISS 16. 

For OFFLINE Prints or Displays, use the ABS or ALL formats to obtain 
abstract graphic structures. The AB format DOES NOT display structure 
diagrams. 



E1      22     ALIZON, J/AU
E2       3     ALIZON, JOSEPH/AU
E3       2 --> ALIZON, M/AU
E4      25     ALIZON, MARC/AU
E5       3     ALJ, A/AU
E6       1     ALJ, A E/AU
E7       1     ALJABAB, A/AU
E8       1     ALJABRE, S H M/AU
E9       8     ALJADEFF, GLADIS/AU
E10      1     ALJADEHEFF, GLADIS/AU

=> s e3-e4; e sonico, p/au 10 

2 "ALIZON, M"/AU 
25 "ALIZON, MARC"/AU 
L4 27 ("ALIZON, M"/AU OR "ALIZON, MARC"/AU) 



E1       6     SONICH, V P/AU
E2       1     SONICH, V V/AU
E3       0 --> SONICO, P/AU
E4       2     SONIDIS, GEORGE P/AU
E5       1     SONIDO, E P/AU
E6       4     SONIE, K C/AU
E7       3     SONIER, FELIX/AU
E8       1     SONIER, FERNAND/AU
E9       3     SONIGO, P/AU
E10     26     SONIGO, PIERRE/AU



=> e stewart, c/au 10 



E1       2     STEWART, BRUCE N/AU
E2       1     STEWART, BURCH BYRON/AU
E3      18 --> STEWART, C/AU
E4      14     STEWART, C A/AU
E5       1     STEWART, C A JR/AU
E6       5     STEWART, C B/AU
E7      10     STEWART, C C/AU
E8       7     STEWART, C D/AU
E9       1     STEWART, C E/AU
E10      6     STEWART, C E E/AU




=> s e3; e Stewart, cole/au 10 

L5 18 "STEWART, C"/AU 



E1       2     STEWART, CLIVE EDWARD E/AU
E2       1     STEWART, CLIVE EDWARD ERNEST/AU
E3       1 --> STEWART, COLE/AU
E4       3     STEWART, COLIN/AU
E5       1     STEWART, COLIN C/AU
E6       2     STEWART, COLIN CROSBIE/AU
E7      11     STEWART, COLIN L/AU
E8      15     STEWART, COLIN S/AU
E9       1     STEWART, COLIN SAMUEL/AU
E10      4     STEWART, CONSTANCE B/AU



=> s e3; s l5 or l6; e danos, o/au 10

L6 1 "STEWART, COLE"/AU
L7 19 L5 OR L6




E1      57     DANOS, MICHAEL/AU
E2       1     DANOS, MICHEL/AU
E3       6 --> DANOS, O/AU
E4       1     DANOS, OLIVER/AU
E5      22     DANOS, OLIVIER/AU
E6       1     DANOS, OLIVIER F/AU
E7       1     DANOS, P T/AU
E8       1     DANOS, R J/AU
E9       5     DANOS, ROBERT J/AU
E10      1     DANOS, SAVAS C/AU


=> s e3-e6; e wain-hobson, s/au 9

6 "DANOS, O"/AU
1 "DANOS, OLIVER"/AU
22 "DANOS, OLIVIER"/AU
1 "DANOS, OLIVIER F"/AU
L8 30 ("DANOS, O"/AU OR "DANOS, OLIVER"/AU OR "DANOS, OLIVIER"/AU
OR "DANOS, OLIVIER F"/AU)



E1       3     WAIN, WILLIAM H/AU
E2       1     WAIN, WILLIAM HENRY/AU
E3       0 --> WAIN-HOBSON, S/AU
E4       2     WAINAI, HIDEKI/AU
E5       1     WAINAI, TASUKU/AU
E6       1     WAINAI, TOHORU/AU
E7      28     WAINAI, TOHRU/AU
E8       1     WAINAI, TOORU/AU
E9       2     WAINAI, TORU/AU

=> e wain, s/au 10

E1       1     WAIN, RUSSELL/AU
E2       1     WAIN, RUSSELL EDMUND/AU
E3       0 --> WAIN, S/AU
E4       1     WAIN, W/AU
E5       8     WAIN, W H/AU
E6       3     WAIN, WILLIAM H/AU
E7       1     WAIN, WILLIAM HENRY/AU
E8       2     WAINAI, HIDEKI/AU
E9       1     WAINAI, TASUKU/AU
E10      1     WAINAI, TOHORU/AU


=> e hobson, s/au 10

E1       1     HOBSON, ROY BAXTER/AU
E2       1     HOBSON, RUSSELL B JR/AU
E3       4 --> HOBSON, S/AU
E4       1     HOBSON, SIMON WAIN/AU
E5             HOBSON, T/AU
E6       4     HOBSON, W/AU
E7       8     HOBSON, W C/AU
E8     115     HOBSON, W S/AU
E9       2     HOBSON, W T/AU
E10      8     HOBSON, WILLIAM/AU


=> s e3-e4

4 "HOBSON, S"/AU
1 "HOBSON, SIMON WAIN"/AU
L9 5 ("HOBSON, S"/AU OR "HOBSON, SIMON WAIN"/AU)

=> s l4 and l7 and l8 and l9; s l4 and (l7 or l8 or l9); s l7 and (l8 or l9); s l8 and l9

L10 0 L4 AND L7 AND L8 AND L9
L11 3 L4 AND (L7 OR L8 OR L9)
L12 0 L7 AND (L8 OR L9)
L13 0 L8 AND L9



=> s (l4 or l7 or l8 or l9) and (lav or lymphadenopath? or htlv or hiv or
lymphotrop? or human (2w) virus?)/ab,bi

245 LAV/AB
98 LAV/BI
264 LYMPHADENOPATH?/AB
130 LYMPHADENOPATH?/BI
1504 HTLV/AB
827 HTLV/BI
6282 HIV/AB
5288 HIV/BI
667 LYMPHOTROP?/AB
536 LYMPHOTROP?/BI
307992 HUMAN/AB
309481 HUMAN/BI
84015 VIRUS?/AB
123349 VIRUS?/BI
8477 HUMAN (2W) VIRUS?

L14 30 (L4 OR L7 OR L8 OR L9) AND (LAV OR LYMPHADENOPATH? OR HTLV
OR HIV OR LYMPHOTROP? OR HUMAN (2W) VIRUS?)/AB,BI



=> s l14 and clon?/ab,bi

84117 CLON?/AB
54670 CLON?/BI
L15 17 L14 AND CLON?/AB,BI

=> s l15 and sequenc?/ab,bi

199071 SEQUENC?/AB
103235 SEQUENC?/BI
L16 14 L15 AND SEQUENC?/AB,BI

=> s (l11 or l16) not l2
L17 15 (L11 OR L16) NOT L2

=> d 1-15 .beverly; fil biosi; s alizon m ?/au; s sonico p ?/au; s Stewart 
c ?/au; s danos o ?/au; s (hobson s ? or wain s ?)/au 

L17 ANSWER 1 OF 15 COPYRIGHT 1993 ACS
AN CA116(5) :39665j 

TI Immunogenic peptides of a variant of LAV (lymphadenopathy virus)
SO U.S. , 49 pp. 

AU Alizon, Marc; Sonigo, Pierre; Wain-Hobson, Simon; Montagnier, Luc 

AI US 87-38332 13 Apr 1987 
PI US 5034511 A 23 Jul 1991 
PY 1991 

AB Immunogenic peptide sequences from LAVELI are presented.
An immunogenic compn. comprising such a peptide and a physiol.
acceptable carrier as well as a diagnostic kit for detecting
antibodies to LAV comprising such a peptide and a reagent
for detecting the formation of peptide/antibody complex are also
claimed. Sequences are claimed from env, gag, and pol
proteins. The complete cDNA of LAVELI is presented. The
sequence was compared with those for other LAV.

L17 ANSWER 2 OF 15 COPYRIGHT 1993 ACS 
AN CA112 (1) :2059f 

TI Expression vectors for manufacture of human 

immunodeficiency virus 2 (HIV2) proteins 
SO Fr. Demande, 31 pp. 

AU Kieny, Marie Paule; Rautmann, Guy; Guy, Bruno; Montagnier, Luc; 

Alizon, Marc; Girard, Marc 
AI FR 87-12396 7 Sep 1987 
PI FR 2620030 A1 10 Mar 1989
PY 1989

AB Viral or plasmid vectors which can be used to manuf. HIV2 proteins
in eukaryotes or prokaryotes are described. The HIV2 proteins can be
used as vaccines or to prep. antibodies. Both proteins and
antibodies can be used in diagnosis. The cDNA for HIV2 protein F was
cloned in plasmid pTG186POLY, and this plasmid used to prep.
recombinant vaccinia virus by std. means. BHK21 cells were infected
with this recombinant virus. Protein which was recognized by serum
from HIV2 pos. patients was produced by these transformants.

L17 ANSWER 3 OF 15 COPYRIGHT 1993 ACS 
AN CA111(1) :2164r

TI Peptides having immunological properties of HIV-2 (human
immunodeficiency virus) for diagnosis and vaccines and simian
immunodeficiency virus genome cDNA sequence
SO PCT Int. Appl., 96 pp. 

AU Alizon, Marc; Montagnier, Luc; Guetard, Denise; Clavel, Francois; 

Sonigo, Pierre; Guyader, Mireille; Tiollais, Pierre; Chakrabarti, 

Lisa; Desrosiers, Ronald 
AI WO 88-FR25 15 Jan 1988 
PI WO 8805440 A1 28 Jul 1988
PY 1988 




AB Peptides having immunol. properties in common with HIV-2,
particularly the envelope glycoprotein of HIV-2, and with
the glycoprotein of SIV-1 (simian immunodeficiency virus) are useful
in detecting infection with HIV-2 and in vaccines.
Diagnostic kits and cDNA sequences esp. for SIV-1 macaque
are also included. The DNA of HUT 78 cells infected with SIV-1 of
macaque was partially digested with restriction endonuclease Sau 3A
and cloned in the BamHI of .lambda. to construct a gene
bank. The recombinant phages were screened using sequences
of HIV-2. One clone, .lambda.SIV-1, had a
16.5-kilobase insert comprising the entire provirus genome lacking
only 250 bases at the left long terminal repeat region. The
nucleotide sequence was detd. by the dideoxynucleotide
method after subcloning in phage M13mp8.

L17 ANSWER 4 OF 15 COPYRIGHT 1993 ACS 
AN CA110(17) :152651r 

TI Envelope antigens of lymphadenopathy-associated virus and
their applications
SO PCT Int. Appl., 78 pp. 

AU Montagnier, Luc; Krust, Bernard; Chamaret, Solange; Clavel, 

Francois; Chermann, Jean Claude; Barre-sinoussi, Francoise; Alizon, 
Marc; Sonigo, Pierre; Stewart, Cole; et al. 

AI WO 85-EP548 18 Oct 1985 

PI WO 8602383 A1 24 Apr 1986

PY 1986 

AB Purified expression products of DNA sequences derived from
the lymphadenopathy-assocd. virus (LAV) genome,
particularly a 110,000-mol.-wt. glycoprotein or derived antigenic
peptides which are recognized by human sera contg. antibodies
against LAV, are prepd. The glycoprotein is used in the
prepn. of monoclonal antibodies and in the prodn. of an immunogenic
compn. capable of neutralizing LAV. The glycoprotein or
polypeptides are also useful in the diagnosis of LAV
antibodies in sera of patients. T-lymphocytes derived from healthy
and LAV1-infected donors were cultivated in a nondenaturing medium
contg. cysteine-35S. The supernatant from the culture medium was
centrifuged at 10,000 rpm for 10 min to remove the nonviral
components, then at 45,000 rpm for 20 min to sediment the virus. The
virus pellet was then lysed by detergent in the presence of
aprotinin and the envelope glycoprotein (gp110) was purified by
affinity chromatog. on Sepharose-Con A and eluted with
O-methyl-.alpha.-D-mannopyranoside. The gp110 was used to immunize
mice for the prodn. of monoclonal antibodies by std. hybridoma
methodol. The sequencing and detn. of peptide or protein
sites of particular interest were carried out on a recombinant phage
corresponding to .lambda.J19 or LAV-Ia.

L17 ANSWER 5 OF 15 COPYRIGHT 1993 ACS 
AN CA109(15) :123790j 

TI Variants of lymphadenopathy-associated viruses, their cDNA
and protein sequences and their use, particularly for 
diagnostic purposes and for the preparation of immunogenic 
compositions 

SO PCT Int. Appl., 72 pp. 

AU Alizon, Marc; Sonigo, Pierre; Wain-Hobson, Simon; Montagnier, Luc 
AI WO 87-EP326 22 Jun 1987 
PI WO 8707906 A1 30 Dec 1987
PY 1987 

AB Two new variants of lymphadenopathy-assocd. viruses
(LAV) designated LAVELI and LAVMAL are isolated and their
genomes characterized. Their DNAs and antigens can be used for the
diagnosis of AIDS and prodn. of vaccines against AIDS. The viruses
were isolated from African patients from Zaire. The genetic
organization of the two new isolates, esp. the region between the
pol and env genes, is identical to that of the other isolates. The
sizes of the U3, R, and U5 elements of the long terminal repeat are
also conserved. Substantial differences are obsd. in the primary
structure of their proteins; the envelope is more variable than the
gag and pol gene proteins.

L17 ANSWER 6 OF 15 COPYRIGHT 1993 ACS 
AN CA109(11) :89337e 

TI Retrovirus of the human immunodeficiency virus 2 

(HIV-2) type capable of inducing AIDS, its antigenic and 

nucleic acid constituents, and diagnostic and therapeutic methods 

and kits 

SO PCT Int. Appl., 117 pp. 

AU Montagnier, Luc; Chamaret, Solange; Guetard, Denise; Alizon, Marc; 

Clavel, Francois; Guyader, Mireille; Sonigo, Pierre; Brun-Vezinet, 

Francoise; Rey, Marianne; et al. 
AI WO 87-FR25 22 Jan 1987 
PI WO 8704459 A1 30 Jul 1987
PY 1987 

AB Retrovirus HIV-2 and its antigenic and nucleic acid
components are useful in diagnostic (e.g. antibody immunoassays) and
therapeutic methods and kits. Protein antigens p12, p16, p26, and
gp140 and genetic material have been prepd. Glycoprotein gp140 is
particularly useful in immunogenic compns. Nucleotide
sequences useful as hybridization probes are disclosed.
HIV of patients from west Africa was isolated by stimulating
their peripheral blood lymphocytes (PBLs) with PHA and cultivating
in coculture with normal PBLs so stimulated and maintained in the
presence of interleukin-2. The viruses were centrifuged, lysed, and
deposited on nitrocellulose. The samples were treated with an
HIV-1 probe corresponding to the complete genome of LAVBRU
or an HIV-2 probe derived from a 2-kb cDNA clone
of LAV-2 ROD, both labeled with 32P, under stringent
hybridization conditions. All of the virus samples hybridized with
the HIV-2 probe only.

L17 ANSWER 7 OF 15 COPYRIGHT 1993 ACS 
AN CA108(23) :199491n 

TI Preparation of recombinant viral vectors encoding human 

immunodeficiency virus ( HIV ) glycoprotein for 

use as anti-AIDS vaccine 
SO Fr. Demande, 36 pp. 

AU Kieny, Marie Paule; Rautmann, Guy; Lecocq, Jean Pierre; Hobson, 

Simon Wain; Girard, Marc; Montagnier, Luc 
AI FR 86-5043 8 Apr 1986 
PI FR 2596771 A1 9 Oct 1987
PY 1987 

AB Viral vectors which encode HIV env protein or variants
thereof are constructed, mammalian cells are infected with them, and
the immunogenicity of the recombinant proteins is analyzed. Plasmid
pTG1125 contg., inserted into the vaccinia virus thymidine kinase
gene, the HIV env gene under the control of the vaccinia
virus 7.5K protein gene promoter was constructed. Viral vector 
W.TG. eLAV 1125 was prepd. by in vivo recombination of pTG1125 with 
vaccinia virus. BHK21 cells infected with this vector produced 
glycoproteins of mol. wt. 160, 120, and 40 kilodaltons which were 
recognized by antiserum isolated from AIDS patients. Balb/c mice 
infected with this vector produced antibodies which reacted with 
160- and 40-kilodalton proteins in sera of AIDS patients. 



L17 ANSWER 8 OF 15 COPYRIGHT 1993 ACS
AN CA108(13) :107210u 
TI Sequence analysis of the human immune deficiency 
virus type 2 

SO UCLA Symp. Mol. Cell. Biol., New Ser., 71(Hum. Retroviruses, Cancer, 
AIDS), 31-42 

AU Guyader, M. ; Emerman, M. ; Sonigo, P.; Clavel, F.; Montagnier, L. ; 

Alizon, M. 
PY 1988 

AB Cloned cDNA probes made from human immunodeficiency type 2
virus (HIV-2) were used to screen a genomic library made
from a T4 cell line infected with the ROD isolate of HIV-2.
Lambda clones contg. proviral DNA were characterized
by restriction mapping, and then used to det. the complete
9671-nucleotide sequence of the genome. The genomic
organization of HIV-2 was 5'LTR-gag-pol-central
region-env-orf F-3'LTR; the central region contained 4 genes related
to those of HIV-1 (sor, R, tat, and art) as well as a 5th
gene (designated X) with no counterpart in HIV-1.
HIV-1 and HIV-2 differed significantly in terms of
nucleotide and amino acid sequence. The more conserved gag
and pol genes displayed only 56 and 60% nucleotide sequence
homol. and both <60% of amino acid identity. Calcn. of the
nucleotide sequence homol. for the other genes gave even
lower values, giving HIV-1 and 2 overall 42% homologous.
To det. whether or not the tat gene of HIV-1 could
trans-activate the LTR of HIV-2 and vice versa, SW480
cells were cotransfected with subgenomic fragments of HIV-1
or HIV-2 and pHIV2-CAT or a plasmid pHIV1-CAT which
contained U3-R of HIV-1. Both HIV-1 and
HIV-2 LTRs were substantially activated by the HIV-1
tat gene.

L17 ANSWER 9 OF 15 COPYRIGHT 1993 ACS 
AN CA108(1) :1300h 

TI Sequence of simian immunodeficiency virus from macaque and 

its relationship to other human and simian retroviruses 
SO Nature (London), 328(6130), 543-7 

AU Chakrabarti, Lisa; Guyader, Mireille; Alizon, Marc; Daniel, Muthiah 

D. ; Desrosiers, Ronald C; Tiollais, Pierre; Sonigo, Pierre 
PY 1987 

AB The complete genome of the proviral form of simian immunodeficiency
virus isolated from a naturally infected macaque was cloned
(.lambda.SIV1) and sequenced. The genome of SIVmac was
9643 nucleotides long with its open reading frames and was organized
(5'LTR-gag-pol-central region-env-F-3'LTR) in a manner typical of a
lentivirus. Comparisons of the proteins of SIV with those of
HIV-1 and HIV-2 quantified the relatedness of
these viruses.

L17 ANSWER 10 OF 15 COPYRIGHT 1993 ACS 
AN CA106(11) :79452n 

TI Molecular cloning and polymorphism of the human 

immune deficiency virus type 2 
SO Nature (London), 324(6098), 691-5 

AU Clavel, Francois; Guyader, Mireille; Guetard, Denise; Salle, 

Mireille; Montagnier, Luc; Alizon, Marc 
PY 1986 

AB A novel retrovirus, human immune deficiency virus
type 2 (HIV-2), was isolated and characterized.
Hybridization expts. indicated that there are substantial
differences between the DNA sequences of HIV-2
and HIV-1. Moreover, the serol. cross-reactivity of the
proteins of the 2 viruses is restricted to the core protein. The
9.5-kilobase genome of HIV-2 was cloned.
Different isolates of HIV-2 exhibited restriction site
polymorphism in their DNAs. The relationship of HIV-2 with
other human and simian retroviruses is discussed.



L17 ANSWER 11 OF 15 COPYRIGHT 1993 ACS 
AN CA106(5) :28512z 

TI Cloned DNA sequences, hybridizable with genomic 
RNA of lymphadenopathy-associated virus (LAV) 
SO PCT Int. Appl., 39 pp. 

AU Alizon, Marc; Barre Sinoussi, Francoise; Sonigo, Pierre; Tiollais, 
Pierre; Chermann, Jean Claude; Montagnier, Luc; Wain-Hobson, Simon 
AI WO 85-EP487 18 Sep 1985 
PI WO 8601827 Al 27 Mar 1986 
PY 1986 

AB Cloned DNA fragments contg. sequences 
hybridizable to genomic RNA and DNA of lymphadenopathy-assocd. 
retrovirus (LAV) are obtained from a cDNA library 
of the LAV genome. These DNA fragments are useful as 
hybridization probes for detection of LAV in biol. samples 
taken from persons possibly afflicted with AIDS. The complete 
sequence and restriction map of the LAV provirus 
genome are presented. 

L17 ANSWER 12 OF 15 COPYRIGHT 1993 ACS 
AN CA105(21) :185219f 

TI AIDS virus env protein expressed from a recombinant vaccinia virus 
SO Bio/Technology, 4(9), 790-5 

AU Kieny, M. P.; Rautmann, G.; Schmitt, D.; Dott, K.; Wain-Hobson, S.; 
Alizon, M.; Girard, M.; Chamaret, S.; Laurent, A.; et al. 
PY 1986 

AB Lymphadenopathy-assocd. virus (LAV) is the 
causative agent of AIDS, the acquired immunodeficiency syndrome. A 
retrovirus of the lentivirus group, LAV carries a single 
major target antigen at its surface: the env protein. The env coding 
sequence was introduced into a vaccinia virus vector. The 
live recombinant virus, WTGeLAV, dets. the prodn. of env protein in 
infected mammalian cells. The recombinant protein reacts with sera 
from AIDS patients and appears to be processed and glycosylated in a 
manner identical to authentic env of LAV retrovirus. 
Inoculation of mice with WTGeLAV elicits high titers of antisera 
recognizing vaccinia determinants but only low titers of antibody 
recognizing env proteins of LAV. Cells infected with the 
recombinant virus rapidly liberate a processed form of the env 
protein into the culture medium. This shedding of surface antigen 
from AIDS virus may play a role in the pathophysiol. of the disease. 

L17 ANSWER 13 OF 15 COPYRIGHT 1993 ACS 
AN CA105(9) :73424n 

TI Lymphadenopathy/AIDS virus: genetic organization and 
relationship to animal lentiviruses 
SO Anticancer Res., 6(3, Pt. B), 403-12 
AU Alizon, Marc; Montagnier, Luc 
PY 1986 

AB A review with 46 refs. on the mol. characterization of the probable 
agent of the acquired immune deficiency syndrome (AIDS), the 
lymphadenopathy/AIDS virus (LAV). Mol. 
cloning and complete nucleotide sequencing of 
LAV allows a detailed comparison with other AIDS virus 
isolates, as well as with other human and animal retroviruses. The 
AIDS virus is closely related to visna virus, prototype of the 
lentiviruses, whereas the other human retroviruses, i.e., human 
T-cell leukemia viruses type I and II (HTLV-I and II), are 
quite remote in the evolution. 

L17 ANSWER 14 OF 15 COPYRIGHT 1993 ACS 
AN CA103(19) :155030d 

TI Nucleotide sequence of the Visna lentivirus: relationship to the 
AIDS virus 

SO Cell (Cambridge, Mass.), 42(1), 369-82 

AU Sonigo, Pierre; Alizon, Marc; Staskus, Katherine; Klatzmann, David; 
Cole, Stewart; Danos, Olivier; Retzel, Ernest; Tiollais, Pierre; 
Haase, Ashley; Wain-Hobson, Simon 
PY 1985 

AB The complete 9202 nucleotide sequence of the visna lentivirus was 
detd. The deduced genetic organization most closely resembles that 
of the AIDS retrovirus in that there is a novel central region sepg. 
pol and env. Moreover, there is a close phylogenetic relation 
between the conserved reverse transcriptase and 
endonuclease/integrase domains of the visna and AIDS viruses. These 
findings support the inclusion of the AIDS virus in the retroviral 
subfamily Lentivirinae. 

L17 ANSWER 15 OF 15 COPYRIGHT 1993 ACS 
AN CA102(9) :73509g 

TI Molecular cloning of lymphadenopathy-associated 
virus 
SO Nature (London), 312(5996), 757-60 
AU Alizon, Marc; Sonigo, Pierre; Barre-Sinoussi, Francoise; Chermann, 
Jean Claude; Tiollais, Pierre; Montagnier, Luc; Wain-Hobson, Simon 
PY 1985 

AB DNA complementary to human lymphadenopathy-assocd. 
virus (LAV) RNA was cloned on 
plasmid pBR327, and the recombinant DNA was used to transform 
Escherichia coli. Plasmid pLAV13 carrying a 2.5-kilobase insert was 
isolated and its nick-translated DNA used as a hybridization probe 
to detect virion RNA in infected cells. LAV virion RNA was 
detected in infected normal T-cells, FR8 and other B cell lines, CEM 
cells, and bone marrow cells from a hemophiliac with AIDS, but not 
in uninfected normal T lymphocyte cells or normal liver. Plasmid 
pLAV13, which did not integrate into the human genome, detected both 
RNA and integrated DNA forms in LAV-infected cells. 
Genomic LAV sequences were similarly 
cloned by inserting HindIII digests of genomic DNA of 
LAV-infected T cells into a phage .lambda. vector; 5 
recombinants that hybridized with nick-translated pLAV13 were 
obtained. 

L41 ANSWER 1 OF 8 COPYRIGHT 1993 BIOSIS 
AN 89:438231 BIOSIS 

TI PACKAGING AND TRANSFER OF A MARKER GENE BY HIV VECTOR PARTICLES. 
AU CLAVEL F; DANOS O; ALIZON M 
SO MORISSET, R. A. (ED.). VE CONFERENCE INTERNATIONALE SUR LE SIDA: LE 
DEFI SCIENTIFIQUE ET SOCIAL; V INTERNATIONAL CONFERENCE ON AIDS: THE 
SCIENTIFIC AND SOCIAL CHALLENGE; MONTREAL, QUEBEC, CANADA, JUNE 4-9, 
1989. 1262P. INTERNATIONAL DEVELOPMENT RESEARCH CENTRE: OTTAWA, 
ONTARIO, CANADA. ILLUS. PAPER. 0 (0). 1989. 583. ISBN: 
0-662-56670-X 

L41 ANSWER 2 OF 8 COPYRIGHT 1993 NLM 
AN 87287230 MEDLINE 

TI Sequence of simian immunodeficiency virus from macaque and 
its relationship to other human and simian retroviruses. 

AU Chakrabarti L; Guyader M; Alizon M ; Daniel MD; Desrosiers 
RC; Tiollais P; Sonigo P 

SO Nature, (1987 Aug 6-12) 328 (6130) 543-7 
Journal code: NSC ISSN: 0028-0836 

AB Because of the growing incidence of AIDS (acquired immune deficiency 
syndrome), the need for studies on animal models is urgent. Infection 
of chimpanzees with the retroviral agent of human AIDS, the 
human immunodeficiency virus (HIV), will 
have only limited usefulness because chimpanzees are in short supply 
and do not develop the disease. Among non-human primates, both type D 
retroviruses and lentiviruses can be responsible for immune 
deficiencies. The D-type retroviruses, although important pathogens 
in macaque monkey colonies, are not satisfactory as a model because 
they differ in genetic structure and pathophysiological properties 
from the human AIDS viruses. The simian 
lentivirus, previously referred to as simian T-cell 
lymphotropic virus type III (STLV-III), now termed simian 
immunodeficiency virus (SIV), is related to HIV by the 
antigenicity of its proteins and in its main biological properties, 
such as cytopathic effect and tropism for CD4-bearing cells. Most 
importantly, SIV induces a disease with remarkable similarity to 
human AIDS in the common rhesus macaques, which therefore constitute 
the best animal model currently available. Natural or experimental 
infection of other monkeys such as African green monkeys or sooty 
mangabeys has not yet been associated with disease. Molecular 
approaches of the SIV system will be needed for biological studies 
and development of vaccines that could be tested in animals. We have 
cloned and sequenced the complete genome of SIV 
isolated from a naturally infected macaque that died of AIDS. This 
SIVMAC appears genetically close to the agent of AIDS in West Africa, 
HIV-2, but the divergence of the sequences of SIV 
and HIV-2 is greater than that previously observed between 
HIV-1 isolates. 

L41 ANSWER 3 OF 8 COPYRIGHT 1993 NLM 
AN 87090385 MEDLINE 

TI Molecular cloning and polymorphism of the human 
immune deficiency virus type 2. 
AU Clavel F; Guyader M; Guetard D; Salle M; Montagnier L; Alizon M 
SO Nature, (1986 Dec 18-31) 324 (6098) 691-5 
Journal code: NSC ISSN: 0028-0836 

AB We recently reported the isolation of a novel retrovirus, the 
human immune deficiency virus type 2 (HIV-2, 
previously named LAV-2), from patients with acquired 
immune deficiency syndrome (AIDS) originating from West Africa. This 
virus is related to HIV-1, the causative agent of the AIDS 
epidemic now spreading in Central and East Africa, as well as the USA 
and Europe (see ref. 3 for review), both by its morphology and by its 
tropism and in vitro cytopathic effect on CD4 (T4) positive cell 
lines and lymphocytes. But preliminary hybridization experiments 
indicated that there are substantial differences between the 
sequences of the two genomes. Furthermore, the proteins of 
HIV-1 and HIV-2 have different sizes and their 
serological cross-reactivity is restricted to the major core protein, 
as the envelope glycoproteins of HIV-2 are not 
immunoprecipitated by HIV-1-positive sera. We now report 
the molecular cloning of the complete 9.5-kilobase (kb) 
genome of HIV-2, the observation of restriction site 
polymorphism between different isolates, and a preliminary analysis 
of the relationship of HIV-2 with other human and simian 
retroviruses. 

L41 ANSWER 4 OF 8 COPYRIGHT 1993 BIOSIS DUPLICATE 1 

AN 86:379265 BIOSIS 

TI LYMPHADENOPATHY-ACQUIRED IMMUNE DEFICIENCY SYNDROME VIRUS 
GENETIC ORGANIZATION AND RELATIONSHIP TO ANIMAL LENTIVIRUSES. 
AU ALIZON M; MONTAGNIER L 
SO ANTICANCER RES 6 (3 PART B). 1986. 403-412. CODEN: ANTRD4 ISSN: 
0250-7005 

AB This article presents data obtained by our group in the molecular 
characterization of the probable agent of the acquired immune 
deficiency syndrome (AIDS), the lymphadenopathy/AIDS virus 
(LAV). Molecular cloning and complete nucleotide 
sequencing of LAV allows a detailed comparison with 
other AIDS virus isolates, as well as other human and animal 
retroviruses. We have now molecular evidence that the AIDS virus is 
closely related to visna virus, prototype of the lentiviruses, 
whereas the other human retroviruses, i.e., human T-cell leukemia 
viruses type I and II (HTLV-I and II), are quite remote in 
the evolution. 

L41 ANSWER 5 OF 8 COPYRIGHT 1993 BIOSIS DUPLICATE 2 

AN 86:377557 BIOSIS 

TI GENETIC VARIABILITY OF THE ACQUIRED IMMUNE DEFICIENCY SYNDROME VIRUS 
NUCLEOTIDE SEQUENCE ANALYSIS OF TWO ISOLATES FROM AFRICAN 
PATIENTS. 

AU ALIZON M; WAIN-HOBSON S; MONTAGNIER L; SONIGO P 
SO CELL 46 (1). 1986. 63-74. CODEN: CELLB5 ISSN: 0092-8674 

AB To define further the genetic variability of the human AIDS 
retrovirus, we have cloned and sequenced the 
complete genomes of two isolates obtained from Zairian patients. 
Their genetic organization is identical with that of isolates from 
Europe and North America, confirming a common evolutionary origin. 
However, the comparison of homologous proteins from these different 
isolates reveals a much greater extent of genetic polymorphism than 
previously observed. It is nevertheless possible to define conserved 
domains in the viral proteins, especially in the envelope, that could 
be of interest for the understanding of the molecular mechanisms of 
viral pathogenicity and for the development of diagnostic and 
therapeutic reagents. 

L41 ANSWER 6 OF 8 COPYRIGHT 1993 BIOSIS DUPLICATE 3 

AN 86:99324 BIOSIS 



TI NUCLEOTIDE SEQUENCE OF THE VISNA LENTIVIRUS RELATIONSHIP TO THE AIDS 
ACQUIRED IMMUNE DEFICIENCY SYNDROME VIRUS. 
AU SONIGO P; ALIZON M; STASKUS K; KLATZMANN D; COLE S; 
DANOS O; RETZEL E; TIOLLAIS P; HAASE A; WAIN-HOBSON S 
SO CELL 42 (1). 1985. 369-382. CODEN: CELLB5 ISSN: 0092-8674 

AB We have determined the complete 9202 nucleotide sequence of the visna 
lentivirus. The deduced genetic organization most closely resembles 
that of the AIDS retrovirus in that there is a novel central region 
separating pol and env. Moreover, there is a close phylogenetic 
relationship between the conserved reverse transcriptase and 
endonuclease/integrase domains of the visna and AIDS viruses. These 
findings support the inclusion of the AIDS virus in the retroviral 
subfamily Lentivirinae. 

L41 ANSWER 7 OF 8 COPYRIGHT 1993 BIOSIS DUPLICATE 4 

AN 85:296617 BIOSIS 

TI NUCLEOTIDE SEQUENCE OF THE ACQUIRED IMMUNE DEFICIENCY SYNDROME VIRUS 
LYMPHADENOPATHY-ASSOCIATED VIRUS. 
AU WAIN-HOBSON S; SONIGO P; DANOS O; COLE S; ALIZON M 
SO CELL 40 (1). 1985. 9-18. CODEN: CELLB5 ISSN: 0092-8674 
AB The complete 9193-nucleotide sequence of the probable causative agent 
of AIDS [acquired immune deficiency syndrome], lymphadenopathy- 
associated virus (LAV), was determined. The deduced genetic structure 
is unique: it shows, in addition to the retroviral gag, pol and env 
genes, 2 novel open reading frames termed Q and F. Remarkably, Q is 
located between pol and env and F is half-encoded by the U3 element 
of the LTR [long terminal repeat]. The data place LAV apart from the 
previously characterized family of human T cell leukemia/lymphoma 
viruses. 

L41 ANSWER 8 OF 8 COPYRIGHT 1993 NLM 
AN 85086249 MEDLINE 

TI Molecular cloning of lymphadenopathy-associated 
virus. 
AU Alizon M; Sonigo P; Barre-Sinoussi F; Chermann JC; Tiollais 
P; Montagnier L; Wain-Hobson S 
SO Nature, (1984 Dec 20-1985 Jan 2) 312 (5996) 757-60 
Journal code: NSC ISSN: 0028-0836 

AB Lymphadenopathy-associated virus (LAV) is a human 
retrovirus first isolated from a homosexual patient with 
lymphadenopathy syndrome, frequently a prodrome or a benign 
form of acquired immune deficiency syndrome (AIDS). Other LAV 
isolates have subsequently been recovered from patients with AIDS or 
pre-AIDS and all available data are consistent with the virus being 
the causative agent of AIDS. The virus is propagated on activated T 
lymphocytes and has a tropism for the T-cell subset OKT4 (ref. 6), in 
which it induces a cytopathic effect. The major core protein of 
LAV is antigenically unrelated to other known retroviral 
antigens. LAV-like viruses have more recently been 
independently isolated from patients with AIDS and pre-AIDS. These 
viruses, called human T-cell leukaemia/lymphoma virus type III 
(HTLV-III) and AIDS-associated retrovirus (ARV), seem to have 
many characteristics in common with LAV and probably 
represent independent isolates of the LAV prototype. We 
have sought to characterize LAV by the molecular 
cloning of its genome. A cloned LAV 
complementary DNA was used to screen a library of recombinant phages 
constructed from the genomic DNA of LAV-infected T 
lymphocytes. Two families of clones were characterized 
which differ in a restriction site. The viral genome is longer than 
any other human retroviral genome (9.1-9.2 kilobases). 

PARAMETERS 

Similarity matrix     Unitary     K-tuple              4 
Mismatch penalty            1     Joining penalty     30 
Gap penalty              1.00     Window size         32 
Gap size penalty         0.33 
Cutoff score                0 
Randomization group         0 

Initial scores to save     40     Alignments to save  10 
Optimized scores to save    0     Display context     50 




SEARCH STATISTICS 

Scores:     Mean     Median     Standard Deviation 
              21         17                  15.71 

Times:      CPU             Total Elapsed 
            00:04:07.93     00:08:22.00 

Number of residues:              12982290 
Number of sequences searched:       20342 
Number of scores above cutoff:       4112 





Cut-off raised to 10. 
Cut-off raised to 17. 
Cut-off raised to 25. 
Cut-off raised to 30. 
Cut-off raised to 33. 

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 

A 100% identical sequence to the query sequence was not found. 
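
The significance values in the listing that follows appear to be derived directly from the search statistics. The block below is a minimal sketch, assuming that significance is simply the number of standard deviations by which an initial score exceeds the mean; the report does not state the program's exact definition. The function name `significance` is illustrative only, and the mean and standard deviation are the rounded values printed under SEARCH STATISTICS, so the last digit can differ for some entries.

```python
def significance(initial_score, mean, std_dev):
    """Number of standard deviations by which a score exceeds the mean.

    Assumption: this z-score form is inferred from the reported numbers,
    not taken from the search program's documentation.
    """
    return (initial_score - mean) / std_dev


# Rounded values as printed under SEARCH STATISTICS.
MEAN = 21
STD_DEV = 15.71

# Initial scores of the first three hits in the list of best scores.
for name, score in [("Q14751", 656), ("Q22488", 640), ("N60240", 633)]:
    print(f"{name}: {significance(score, MEAN, STD_DEV):.2f}")
```

Under this assumption the three hits come out at about 40.42, 39.40 and 38.96, which agrees with the Sig. column.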



The list of best scores is: 

                                                          Init.   Opt. 
    Sequence Name  Description                    Length  Score  Score    Sig. Frame 

    **** 40 standard deviations above mean **** 
 1. Q14751         HIV-1(MN) env protein-encodin    9739    656    663   40.42     0 
    **** 39 standard deviations above mean **** 
 2. Q22488         HIV-1 proviral clone pNL4-3.     9709    640    640   39.40     0 
    **** 38 standard deviations above mean **** 
 3. N60240         HTLV-III virus (HIV virus) DN    9745    633    623   38.96     0 
    **** 36 standard deviations above mean **** 
 4. Q14752         HIV-1 (MN-ST1) env protein-enc   9746    602    641   36.98     0 
    **** 33 standard deviations above mean **** 
 5. N60365         Sequence of LAV virus genome     9193    554    554   33.93 
 6. N60288         Sequence of the HTLV-III geno    9213    547    547 
 7. N60476         Sequence of lymphadenopathy-a    9088    542    542                 0 
 8. Q15226         HIV-1 TAT mRNA.                  1833    541    545   33.10     0 
 9. N71016         Sequence of LAV/HTLV III enve    4020    541    541   33.10     0 
    **** 30 standard deviations above mean **** 
10. N80436         Entire sequence of LAV ELI       9236    502    502   30.62     0 
11. Q06635         Complete sequence of HIV 1-ND    9143    499    499   30.43 
12. N60140         Sequence of ARV-2 (9B) cDNA i    9737    493    652   30.05     0 
    **** 23 standard deviations above mean **** 
13. Q11943         Nucleotide sequence of HIV-1     9192    394    513   23.74     0 
    **** 22 standard deviations above mean **** 
14. N80437         Entire sequence of LAV MAL       9229    382    470   22.98 
    **** 16 standard deviations above mean **** 
15. Q14753         HIV-1 BA-L clone.                3807    282    282   16.61     0 
    **** 14 standard deviations above mean **** 
16. N80890         Sequence of cDNA clone HIV-2     9633    246    342                 0 
17. N92119         Sequence of clone HIV-2 SBL/1    9693    246    342   14.32 
    **** 13 standard deviations above mean **** 
18. N71017         Sequence of LAV/HTLV III gag     5340    235    235   13.62 
    **** 12 standard deviations above mean **** 
19. N90824         HIV LTR gene structure.           718    221    228   12.73 
    **** 10 standard deviations above mean **** 
20. Q21163         COS cell expression vector pi    2932    192    377   10.89 
    **** 9 standard deviations above mean **** 
21. Q02829         DNA complementary to simian i    9170    177 
22. Q20616         ROD HIV-2 isolate complete ge    9672    175 
23. N92768         HIV-2 variant HIV-D194 clone.    9473    174              9.74 
    N80859         Sequence of entire HIV-2 ROD     9643    173 
    N92618         Portion of the HIV-3 retrovir     360    167 
    **** 8 standard deviations above mean **** 
                   SIVmac239 nef-deletion.         10097    151    343    8.28 
28.                SIVmac239 proviral genome.      10279    151 
                   DNA sequence of expression ve    1143    149 
                                                                   381    8.08     0 
31.                Expression vector piH3M.         3900    148 
    **** 7 standard deviations above mean **** 
    N92769         HIV-2 variant HIV-D205 clone      324    137 
    **** 6 standard deviations above mean **** 
    **** 5 standard deviations above mean **** 
35. Q13189         Synthetic TAR sequence.           120    109    114    5.60     0 
36. N93063         Sequence encoding hybrid prot    1383    105    266    5.35     0 
37. N50333         Sequence of exons I and II an     986    104    292    5.28     0 
38. Q20532         Sequence of clone lambdaAPCP1    2256    103    303    5.22     0 
39. Q10014         Clone lambda APCP168i4 of bet    2256    103    303    5.22     0 
40. N80604         Lambda APCP168i4,amino acids     2256    103    303    5.22     0 


1. RAILEY-000-716.SEQ (1-696) 
   Q14751  HIV-1(MN) env protein-encoding sequence. 

ID   Q14751 standard; DNA; 9739 BP. 
AC   Q14751; 
DT   05-FEB-1992 (first entry) 
DE   HIV-1(MN) env protein-encoding sequence. 
KW   human immunodeficiency virus; United States; MN isolate; AIDS; 
KW   envelope protein; ss. 
OS   Human immunodeficiency virus-1 (MN). 
FH   Key             Location/Qualifiers 
FT   CDS             6240..8810 
FT                   /*tag= a 
FT                   /product= env 
PN   US7599491-A. 
PD   15-OCT-1991. 
PF   17-OCT-1990; 183830. 
PR   17-OCT-1990; US-599491. 
PA   (USSH ) NAT INST OF HEALTH. 
PI   Reitz M; 
DR   WPI; 91-346752/47. 
DR   P-PSDB; R14903. 
PT   US HIV-1 isolates MN-ST1 and BA-L, ENV protein and DNA - are 
PT   useful in therapeutics, vaccines and diagnostic tests 
PS   Example 1; Fig 2; 61pp; English. 
CC   The permuted circular unintegrated viral DNA representing the 
CC   complete HIV-1 (MN) genome was cloned into the EcoRI site of lambda 
CC   gtWES.lambda B DNA from total DNA of H9 cells producing HIV-1 (MN). 
CC   This clone was designated lambda MN-PH1; it was subcloned in M13mp18 
CC   and M13mp19 and the DNA sequence of the entire clone was obtained. 
CC   The four "OTHERS" in the sequence represent bases which are 
CC   illegible in the specification. The amino acid sequence of the env 
CC   protein was deduced from this sequence and the env gene was 
CC   subcloned so that recombinant production of the env protein was 
CC   possible. 
SQ   Sequence 9739 BP; 3457 A; 1774 C; 2313 G; 2191 T; 
SQ   4 Others; 

Initial Score    =  656   Optimized Score = 663   Significance = 40.42 
Residue Identity =  96%   Matches         = 665   Mismatches   = 21 
Gaps             =    3   Conservative Substitutions           = 0 

            10        20        30        40        50        60        70
    GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
           TGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA
           X       10        20        30        40        50        60

          80        90       100       110       120       130       140
    GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC
    |||||||||||||||| ||||||||||||||||||||||||| |||||||||||||||||||||||||||||
    GGCTACTTCCCTGATTAGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGC
       70        80        90       100       110       120       130

       150       160       170       180       190       200       210
    TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC
    |||||||||||||||||||||||||| ||| ||||||| ||||| |||||||||||||||||||||||||||
    TACAAGCTAGTACCAGTTGAGCCAGAGAAGTTAGAAGAAGCCAACAAAGGAGAGAACACCAGCTTGTTACAC
    140       150       160       170       180       190       200

     220       230       240       250       260       270       280
    CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA
    ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
    CCTGTGAGCCTGCATGGAATGGATGACCCGGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA
  210       220       230       240       250       260       270       280

   290       300       310       320       330       340       350       360
    TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA
    ||||||||| ||||||||||||||||||||||||||||||||||||||| |||||||||||||||||| |||
    TTTCATCACATGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTCACATCGAGCTTGCTACAATGGA
          290       300       310       320       330       340       350

           370       380       390       400       410       420       430
    CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC
    |||||||||||| |||||||||| ||||||||||||||||| ||||||||||||||||||||||||| ||||
    CTTTCCGCTGGGGACTTTCCAGGTAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATCCTGC
        360       370       380       390       400       410       420

         440       450       460       470       480       490       500
    ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG
    ||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
    ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG
      430       440       450       460       470       480       490

       510       520       530       540       550       560       570
    CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT
    500       510       520       530       540       550       560

     580       590       600       610       620       630       640
    GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC
    ||| ||||||||||||| |||||||||||||||| |||||||| ||||||||||||||||||||||||||||
    GTTATGTGACTCTGGTAGCTAGAGATCCCTCAGATCCTTTTAGGCAGTGTGGAAAATCTCTAGCAGTGGCGC
  570       580       590       600       610       620       630       640

   650       660       670       680       690     X
    CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA
    |||||||||||||||||||||||||  ||||||   ||||||||||||
    CCGAACAGGGACTTGAAAGCGAAAGAAAAACCA---GAGCTCTCTCGACGCAGGACTCGGCTTGCTGAAGCG
          650       660       670          680     X 690       700       710

    CGCACGGCAAGAGGCGAGGGGCGGCG
           720       730
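
The line of bars between the two sequences in each alignment block marks identical residues. The following is a minimal sketch of how such a match line can be derived from two equal-length, already aligned strings. The function name `match_line` and the use of `-` as the gap symbol are assumptions for illustration, not details taken from the search program.

```python
def match_line(query, subject, match="|", other=" ", gap="-"):
    """Return a marker line with `match` where aligned residues are identical.

    Both strings must already be aligned (same length); gap columns and
    mismatches are left blank.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        match if q == s and q != gap else other
        for q, s in zip(query, subject)
    )


# Last block of the first alignment (query positions 649-696).
query = "CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA"
subject = "CCGAACAGGGACTTGAAAGCGAAAGAAAAACCA---GAGCTCTCTCGA"

line = match_line(query, subject)
print(query)
print(line)
print(subject)
print("identical positions:", line.count("|"))
```

For this block the sketch marks 43 identical positions, leaving blank the two mismatched columns and the three gap columns.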



2. RAILEY-000-716.SEQ (1-696) 
   Q22488  HIV-1 proviral clone pNL4-3. 

ID   Q22488 standard; DNA; 9709 BP. 
AC   Q22488; 
DT   06-JUL-1992 (first entry) 
DE   HIV-1 proviral clone pNL4-3. 
KW   AIDS; Acquired Immune Deficiency Syndrome; polymerase chain reaction; 
KW   PCR; site-directed mutagenesis; retrovirus; null-mutation; human; ss. 
OS   Human immunodeficiency virus. 
FH   Key             Location/Qualifiers 
FT   repeat_region   1..634 
FT                   /*tag= a 
FT                   /rpt_type= TERMINAL 
FT                   /note= "5'LTR" 
FT   repeat_unit     456..548 
FT                   /*tag= b 
FT                   /standard_name= R 
FT   GC_signal       375..385 
FT                   /*tag= c 
FT                   /standard_name= Sp1_binding_site 
FT   GC_signal       389..395 
FT                   /*tag= d 
FT                   /standard_name= Sp1_binding_site 
FT   GC_signal       399..407 
FT                   /*tag= e 
FT                   /standard_name= Sp1_binding_site 
FT   primer_bind     636..656 
FT                   /*tag= f 
FT                   /standard_name= Lys_tRNA_pbs 
FT   CDS             790..2292 
FT                   /*tag= g 
FT                   /product= gag 
FT   CDS             2087..5096 
FT                   /*tag= h 
FT                   /product= pol 
FT                   /note= "NH2-terminal uncertain" 
FT   CDS             5041..5619 
FT                   /*tag= i 
FT                   /product= vif 
FT   CDS             5559..5849 
FT                   /*tag= j 
FT                   /product= vpr 
FT   CDS             6061..6306 
FT                   /*tag= k 
FT                   /product= vpu 
FT   exon            5830..6044 
FT                   /*tag= l 
FT                   /product= tat 
FT                   /note= "full-length tat obtained by splicing" 
FT   exon            5969..6044 
FT                   /*tag= m 
FT                   /product= rev 
FT                   /note= "full-length rev obtained by splicing" 
FT   exon            8369..8414 
FT                   /*tag= n 
FT                   /product= tat 
FT                   /note= "see above" 
FT   exon            8369..8643 
FT                   /*tag= o 
FT                   /product= rev 
FT                   /note= "see above" 
FT   CDS             6221..8785 
FT                   /*tag= p 
FT                   /product= env 
FT   CDS             8787..9407 
FT                   /*tag= q 
FT                   /product= nef 
FT   repeat_region   9076..9709 
FT                   /*tag= r 
FT                   /rpt_type= TERMINAL 
FT                   /note= "3'LTR" 
FT   repeat_unit     9531..9624 
FT                   /*tag= s 
FT                   /standard_name= R 
FT   polyA_signal    9602..9607 
FT                   /*tag= t 
PN   WO9200987-A. 
PD   23-JAN-1992. 
PF   10-JUL-1991; U04884. 
PR   12-JUL-1990; US-551945. 
PA   (HARD ) HARVARD COLLEGE. 
PI   Desrosiers RC. 
DR   WPI; 92-056816/07. 
PT   Primate lentivirus vaccine protecting against AIDS - and primate 
PT   lentiviruses and their DNA clones contg. null mutations, useful for 
PT   producing vaccine 
PS   Disclosure; Fig 3; 51pp; English. 
CC   The proviral clone pNL4-3 was used as the basis for creating the 
CC   null-mutations of the invention. The clone was described in 
CC   Adachi et al., J. Virol. 59:284, 1986. See Q21079-Q21086 for 
CC   examples of mutagenic primers for site-directed deletion of regions 
CC   of NL4-3. 
SQ   Sequence 9709 BP; 3421 A; 1759 C; 2365 G; 2161 T; 
SQ   3 Others; 

Initial Score    =  640   Optimized Score = 640   Significance = 39.40 
Residue Identity =  92%   Matches         = 640   Mismatches   = 49 
Gaps             =    0   Conservative Substitutions           = 0 



            10        20        30        40        50        60        70
    GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA
           ||||||||||||||   ||||||  ||||||||| |||||||||||||||    |||||||||||
           TGGAAGGGCTAATTTGGTCCCAAAAAAGACAAGAGATCCTTGATCTGTGGNNNCACCACACACAA
           X       10        20        30        40        50        60

          80        90       100       110       120       130       140
    GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC
    |||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
    GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGC
       70        80        90       100       110       120       130

       150       160       170       180       190       200       210
    TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC
    | |||| ||||||||||||| |||||  | ||||||||||||||  |||||||||| | |||||||||||||
    TTCAAGTTAGTACCAGTTGAACCAGAGCAAGTAGAAGAGGCCAAATAAGGAGAGAAGAACAGCTTGTTACAC
    140       150       160       170       180       190       200

     220       230       240       250       260       270       280
    CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA
    ||| |||||| |||||| ||||| ||||| ||| ||||||| |||| ||||| ||||||||||| |||||||
    CCTATGAGCCAGCATGGGATGGAGGACCCGGAGGGAGAAGTATTAGTGTGGAAGTTTGACAGCCTCCTAGCA
  210       220       230       240       250       260       270       280

   290       300       310       320       330       340       350       360
    TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA
    |||| |||| |||||||| |||||||||||||||||| |||  |||||||||||||||||| ||||||||||
    TTTCGTCACATGGCCCGACAGCTGCATCCGGAGTACTACAAAGACTGCTGACATCGAGCTTTCTACAAGGGA
          290       300       310       320       330       340       350

           370       380       390       400       410       420       430
    CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC
    |||||||||||| |||||||||||||| ||||||||||||| |||||||||||||||||||||||||||| |
    CTTTCCGCTGGGGACTTTCCAGGGAGGTGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATGCTAC
        360       370       380       390       400       410       420

         440       450       460       470       480       490       500
    ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG
    ||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
    ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG
      430       440       450       460       470       480       490

       510       520       530       540       550       560       570
    CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT
    |||||||||||||||||||||||||||||||||||||||||||||||||||  |||||||||||||||||||
    CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTCAAAGTAGTGTGTGCCCGTCT
    500       510       520       530       540       550       560

     580       590       600       610       620       630       640
    GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC
    ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC
  570       580       590       600       610       620       630       640

   650       660       670       680       690     X
    CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA
    |||||||||||||||||||||||||  || ||||||||| ||||||||
    CCGAACAGGGACTTGAAAGCGAAAGTAAAGCCAGAGGAGATCTCTCGACGCAGGACTCGGCTTGCTGAAGCG
          650       660       670       680       690       700       710

    CGCACGGCAAGAGGCGAGGGGCGGCG
        720       730
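
The Residue Identity figure reported with each alignment can be related to the Matches, Mismatches and Gaps counts printed beside it. The sketch below assumes that identity is the number of matches divided by the total number of alignment columns (matches + mismatches + gaps), truncated to a whole percent. That formula is an inference which happens to reproduce the figures reported for the first two hits; it is not the program's documented definition, and `residue_identity` is a hypothetical name.

```python
def residue_identity(matches, mismatches, gaps):
    """Percent identity over all alignment columns, truncated to an integer.

    Assumption: denominator = matches + mismatches + gaps, with truncation
    rather than rounding (inferred from the reported values).
    """
    columns = matches + mismatches + gaps
    return 100 * matches // columns


# Counts reported for the first two alignments.
print(residue_identity(665, 21, 3))  # Q14751
print(residue_identity(640, 49, 0))  # Q22488
```

With these inputs the sketch gives 96 and 92, matching the percentages shown for the two records.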



3. RAILEY-000-716.SEQ (1-696) 
   N60240  HTLV-III virus (HIV virus) DNA. 

ID   N60240 standard; DNA; 9745 BP. 
AC   N60240; 
DT   01-JAN-1980 (first entry) 
DE   HTLV-III virus (HIV virus) DNA. 
KW   HTLV-III; HIV virus; AIDS; active immunization; 
KW   passive immunization; vaccine; ss. 
OS   HIV virus (HTLV-III). 
FH   Key             Location/Qualifiers 
FT   CDS             786..2318 
FT                   /*tag= a 
FT                   /note= "gag protein open reading frame" 
FT   CDS             2078..5122 
FT                   /*tag= b 
FT                   /note= "pol protein open reading frame" 
FT   CDS             5037..5646 
FT                   /*tag= c 
FT                   /note= "sor protein open reading frame" 
FT   CDS             6230..8818 
FT                   /*tag= d 
FT                   /note= "env-lor protein open reading frame" 
PN   EP-185444-A. 
PD   25-JUN-1986. 
PF   10-OCT-1985; 307260. 
PR   10-OCT-1984; US-659339. 
PR   23-JAN-1985; US-693866. 

PA (CENT-) CENTOCOR INC. 

PI   Chang NT; 
DR   WPI; 86-163443/26. 
DR   P-PSDB; P60346-49. 
PT   New immunoreactive HTLV-III polypeptide expressed by transformed 
PT   cells - and derived antibodies, useful for diagnosis of AIDS and 
PT   in active or passive immunisation 
PS   Disclosure; Fig. 3; 60pp; English. 
CC   HIV virus cDNA is cleaved with restriction endonucleases to produce 
CC   fragments coding for the specified proteins. The resulting proteins, 
CC   gag, pol, sor and env-lor, and antibodies against them are useful 
CC   for immunoassay of HIV virus, e.g. by sandwich type RIA. The 
CC   proteins may also be used in vaccines for active immunization. 
SQ   Sequence 9745 BP; 3434 A; 1782 C; 2363 G; 2166 T; 

Initial Score    =  633   Optimized Score = 623   Significance = 38.96 
Residue Identity =  97%   Matches         = 626   Mismatches   = 13 
Gaps             =    4   Conservative Substitutions           = 0 

V — - — r " X 10 20 

GGGGGACTGGAAGGGCTAATTC 
||||||||||||||||||||||

CAATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTC 
9060 9070 9080 9090 9100 X 9110 9120 



30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTAGCAGA 
9130 9140 9150 9160 9170 9180 9190 



100 110 120 130 140 150 160 

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

|||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
ACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 
9200 9210 9220 9230 9240 9250 9260 9270 

170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAG^ 



mi 111 limn urn Milium iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiinii 

CAGAGAAGTTAGAAGAAGCCAACAAAGGAGAGAACACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAAT 
9280 9290 9300 9310 9320 9330 9340 

240 250 260 270 280 290 300 

GGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGA 

lllllllll lilllllllllllllllllllllllllllllllllllllllllllllll llllllllll 

GGATGACCC--GGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACATGGCCCGAGA
9350 9360 9370 9380 9390 9400 9410 

310 320 330 340 350 360 370 380 

GCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCC 

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||

GCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGGACTTTCC 
9420 9430 9440 9450 9460 9470 9480 

390 400 410 420 430 _ 440 450 

AGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTG

||||||||||||||||||||| ||||||||||||||||||||||||| ||||||||||||||||||||||||

AGGGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATCCTGCATATAAGCAGCTGCTTTTTG 
9490 9500 9510 9520 9530 9540 9550 

460 470 480 490 500 510 520 

CCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGC 

||||||||||||||||||||||||||||||| ||||||||||||||||||||||| ||||||||||||||||
CCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAGCTAGGGAACCCACTGC
9560 9570 9580 9590 9600 9610 9620 

530 540 550 560 570 580 590 

TTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACT 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

TTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACT 
9630 9640 9650 9660 9670 9680 9690 9700

600 610 620 630 640 650 660

AGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGCCCGAACAGGGACTTGAAAGC

|||||||||||||||||||||||||||||||||||||||||||||

AGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCA

9710 9720 9730 9740 X

670 680 690

GAAAGGGAAACCAGAGGAGCTCT

4. RAILEY-000-716.SEQ (1-696)

Q14752 HIV-1 (MN-ST1) env protein-encoding sequence.

ID Q14752 standard; DNA; 9746 BP.

AC Q14752;

DT 05-FEB-1992 (first entry)

DE HIV-1 (MN-ST1) env protein-encoding sequence.

KW human immunodeficiency virus; United States; MN isolate; AIDS;

KW envelope protein; ss.

OS Human immunodeficiency virus-1 (MN).

FH Key Location/Qualifiers

FT CDS 6243..8806

FT /*tag= a

FT /product= env

PN US7599491-A. 

PD 15-OCT-1991.

PF 17-OCT-1990; 183830.

PR 17-OCT-1990; US-599491.

PA (USSH ) NAT INST OF HEALTH.

PI Reitz M;

DR WPI; 91-346752/47.

DR P-PSDB; R14904.



PT US HIV-1 isolates MN-ST1 and BA-L. ENV protein and DNA - are

PT useful in therapeutics, vaccines and diagnostic tests

PS Example 2; Fig 6; 61pp; English.

CC The infectious molecular clone, lambda MN-ST1, was obtained by

CC cloning integrated provirus from DNA purified from peripheral blood

CC lymphocytes infected with HIV-1(MN) and maintained in culture for

CC one month. The integrated proviral DNA was partially digested with

CC Sau3A to give fragments of 15-20 kb. The fragments were cloned in

CC EMBL3 and the entire sequence of the clone was determined.

SQ Sequence 9746 BP; 3465 A; 1752 C; 2355 G; 2174 T;

Initial Score = 602 Optimized Score = 641 Significance = 36.98 
Residue Identity = 93% Matches = 645 Mismatches = 41

Gaps = 5 Conservative Substitutions = 0 
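
The identity figure in the score block above can be cross-checked against the counts printed beside it. The following is a minimal sketch, assuming that identity is the number of matches divided by all aligned columns (matches, mismatches and gaps), with the percentage truncated to a whole number. That formula is inferred from the numbers on this printout, not taken from FastDB documentation, and the function name is hypothetical.

```python
def residue_identity(matches: int, mismatches: int, gaps: int) -> int:
    """Percent identity over aligned columns, truncated to a whole number.

    Assumption: aligned columns = matches + mismatches + gaps. This is
    inferred from the printed figures, not from FastDB documentation.
    """
    aligned = matches + mismatches + gaps
    if aligned <= 0:
        raise ValueError("alignment must contain at least one column")
    return (100 * matches) // aligned


# Counts from the score block of result 4 (Q14752).
print(residue_identity(645, 41, 5))  # prints 93
```

The same calculation gives 99 for a block reporting 554 matches, 4 mismatches and no gaps, which agrees with the identity printed for that result.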

10 20 30 40 50 60 70 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

mi in inn iiiiiin i mum miimimm imimiimi 

TGGATGGGTTAATTTACTCCCAAAG-AGACAAGACATCCTTGATCTGTGGGTCTACCACACACAA 
X 10 20 30 40 50 60 

80 90 100 110 120 130 140 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 

1 1 1 i 1 1 1 1 1 f i i 1 1 ! f 1 1 1 1 1 i 1 1 1 1 1 1 1 1 i 1 1 f 1 1 1 ! 1 1 1 i 1 1 1 1 1 1 1 1 1 f i 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGC 
70 80 90 100 110 120 130 

150 160 170 180 190 200 210 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 

i mmiimiiiiimiim m mimmmmiiiiiimii i milium 

TTCAAGCTAGTACCAGTTGAGCCAGAGAAGATAGAAGAGGCCAATAAAGGAGAGAACAACTGCTTGTTACAC 
140 150 160 170 180 190 200 

220 230 240 250 260 270 280 

CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 

m mm mm immiiii immiiimm inn n mum mini 

CCTATGAGCCAGCATGGGATGGATGACCCGGAGAGAGAAGTGTTAGTGTGGAAGTCTGACAGCCACCTAGCA 
210 220 230 240 250 260 270 280 

290 300 310 320 330 340 350 360 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 

lllll II illllllllllllllllllllllll 1 1 1 1 II S M I M I j 1 1 1 1 M M Milium 

TTTCAGCATTATGCCCGAGAGCTGCATCCGGAGTACTACAAGAACTGCTGACATCGAGCTATCTACAAGGGA 
290 300 310 320 330 340 350 

370 380 390 400 410 420 430 

CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC 

mimmii imimiimi iiimimm 11 mmmiimmmimm 

CTTTCCGCTGGGGACTTTCCAGGGAGGTGTGGCCTGGGCGGGACCGGGGAGTGGCGAGCCCTCAGATGCTGC 
360 370 380 390 400 410 420 

440 450 460 470 480 490 500 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG 

lllllllllllllllll 1 1 i 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 f 1 i f 1 1 f I i 1 1 1 1 1 1 miimiiimiiiiii 

ATATAAGCAGCTGCTTTCTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG 
430 440 450 460 470 480 490 

510 520 530 540 550 560 570 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

NMIIIMIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIilllllllllllllllllllllllllllllll 
CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

500 510 520 530 540 550 560 

580 590 600 610 620 630 640 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 
Ifi III MM II 111 MM MM MM MM MMtMl M Hi I I I 1 M i I M I I I I M 1 1 1 I 1 



GTTATGTCACTCTGGTAGCTAGAGATCCCTCAGATCCTTTTAGGCA — GT GGAAAATCTCTAGCAGTGGCGC 
570 580 590 600 610 620 630 

650 660 670 680 690 X

CCGAACAGGGAC-- TTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 
lllilimill lllillllllll llllllllllllllllllllll 

CCGAACAGGGACCTCTGAAAGCGAAAGAGAAACCAGAGGAGCTCTCTCGACGCAGGACTCGGCTTGCTGAAG 
640 650 660 670 680 690 700 710 

CGCGCACGGCAAGAGGCGAGGGGCGGCG 
720 730 



5. RAILEY-000-716.SEQ (1-696)

N60365 Sequence of LAV virus genome.

ID N60365 standard; cDNA; 9193 BP.

AC N60365;

DT 20-AUG-1991 (first entry)

DE Sequence of LAV virus genome.

KW AIDS vaccine; diagnosis; immunoassay; HIV; HTLV-III; ss.

OS Lymphadenopathy virus.

FH Key Location/Qualifiers 

FT CDS 312..1838

FT /*tag= a

FT /product= gag

FT CDS 1631..4642

FT /*tag= b

FT /product= pol

FT CDS 4554..5165

FT /*tag= c

FT /product= ORF Q

FT CDS 5746..8352

FT /*tag= d

FT /product= env

FT CDS 8324..8974

FT /*tag= e

FT /product= ORF F

PN WO8602383-A.

PD 24-APR-1986.

PF 18-OCT-1985; E00548.

PR 18-OCT-1984; FR-016013.

PR 16-NOV-1984; GB-029099.

PR 21-JAN-1985; GB-001473.

PA (CNRS ) CNRS CENT NAT RECH SCI.

PA (INSP ) INST PASTEUR.

PI Montagnier L, Krust B, Chamaret S, Clavel F, Chermann J-C,

PI Barre-Sinoussi F, Alizon M, Sonigo P, Stewart C, Danos O,

PI Wain-Hobson S.

DR WPI; 86-119166/18.

DR P-PSDB; P60419, P60420, P60421, P60422, P60423. 

PT Purified glycoprotein and peptide(s) - are recognised by sera contg 

PT antibodies against lymphadenopathy virus and useful in detecting

PT AIDS antibodies or in vaccines

PS Disclosure; Fig 4; 75pp; English.

CC The inventors claim a polypeptide which is recognised by sera of

CC human origin contg. antibodies against the virus of

CC lymphadenopathies (LAV) or acquired immune deficiency syndrome

CC (AIDS). Also claimed are various peptides corresp. to the AA

CC sequences deducible from proteins encoded by LAV DNA, defined by

CC specific residues (e.g. 12-32, 37-46, 49-79, 88-153) in accordance

CC with a formula given in the specification. 

SQ Sequence 9193 BP; 3278 A; 1652 C; 2216 G; 2047 T; 
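
Because this listing is scanned text, the SQ line offers a built-in consistency check: the four base counts should add up to the stated sequence length. A small sketch of that check follows. The helper name and the regular expressions are illustrative assumptions, not part of any database format definition.

```python
import re


def sq_line_is_consistent(line: str) -> bool:
    """Return True when the base counts on an SQ line sum to the stated length."""
    header = re.match(r"SQ\s+Sequence\s+(\d+)\s+BP;", line)
    if header is None:
        raise ValueError("not an SQ line: " + line)
    total = int(header.group(1))
    # Each composition field looks like "3278 A;" (U replaces T in RNA records).
    counts = [int(n) for n in re.findall(r"(\d+)\s+[ACGTU];", line)]
    return len(counts) == 4 and sum(counts) == total


print(sq_line_is_consistent("SQ Sequence 9193 BP; 3278 A; 1652 C; 2216 G; 2047 T;"))
```

A line whose digits were misread in scanning would generally fail this check, which makes it a cheap way to flag records that need to be compared against the original page.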

Initial Score = 554 Optimized Score = 554 Significance = 33.93
Residue Identity = 99% Matches = 554 Mismatches = 4

Gaps = 0 Conservative Substitutions = 0



X 10 20 

GGGGGACTGGAAGGGCTAATTC 

IIIIIIIIIIIIIIIIIIIIII 

CAATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTC 
8590 8600 8610 8620 8630 8640 8650 

30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 
8660 8670 8680 8690 8700 8710 8720 

100 110 120 130 140 150 160

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 
8730 8740 8750 8760 8770 8780 8790 8800 

170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG
8810 8820 8830 8840 8850 8860 8870 

240 250 260 270 280 290 300 310 

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 
8880 8890 8900 8910 8920 8930 8940 

320 330 340 350 360 370 380 

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 

|||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||

TGCATCCGCAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGGACTTTCCAG 
8950 8960 8970 8980 8990 9000 9010

390 400 410 420 430 440 450 

GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 

|||||||||||||||| || ||||||||||||||||||||||||||||||||||||||||||||||||||||

GGAGGCGTGGCCTGGGGGGGACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 
9020 9030 9040 9050 9060 9070 9080 

460 470 480 490 500 510 520 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT
9090 9100 9110 9120 9130 9140 9150 9160 

530 540 550 560 570 580 590 

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAG 

||||||||||||||||||||||||||||||||

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCA 
9170 9180 9190 X 

600 
AGATCCCTCA 



6. RAILEY-000-716.SEQ (1-696) 

N60288 Sequence of the HTLV-III genome.

ID N60288 standard; DNA; 9213 BP.
AC N60288;

DT 08-JUN-1991 (first entry)

DE Sequence of the HTLV-III genome.
KW HIV; LAV; AIDS; diagnosis; vaccine; ss.
OS HTLV-IIIB/H9 cells (ATCC CRL 8543).
FH Key Location/Qualifiers
FT repeat_region 1..96
FT /*tag= a
FT misc_feature 97..183
FT /*tag= b
FT /label= unique region
FT CDS 336..731
FT /*tag= c
FT /product= gag
FT CDS 732..1772
FT /*tag= d
FT /product= p24gag
FT CDS 1639..4677
FT /*tag= e
FT /product= pol
FT CDS 4622..5200
FT /*tag= f
FT /product= p'
FT CDS 5802..7335
FT /*tag= g
FT /product= env
FT CDS 7336..8373
FT /*tag= h
FT /product= gp41env
FT CDS 8375..8995
FT /*tag= i
FT /product= E'
FT misc_feature 8662..9117
FT /*tag= j
FT /label= unique region
FT repeat_region 9118..9213
FT /*tag= k
FT polyA_signal 9090..9095
FT /*tag= l
FT polyA_signal 9190..9195
FT /*tag= m
PN EP-187041-A.
PD 09-JUL-1986.
PF 23-DEC-1985; 309454.
PR 24-DEC-1984; US-635272.
PR 04-DEC-1985; US-805069.
PA (GETH ) GENENTECH INC.
PI Capon DJ, Lasky LA;
DR WPI; 86-177602/28. 

DR P-PSDB; P60309, P61507, P61504, P61514, P61515.

PT Acquired immune deficiency syndrome polypeptide(s) - obtd. by

PT molecular cloning etc. and used for diagnosis and in vaccines

PT against virus disease

PS Example; Fig 2; 125pp; English.

CC A comparison of N60287 with the cDNA of the HTLV-III genome
CC revealed one particular clone, designated p7.11, which contained a
CC DNA sequence encoding this peptide (P60308) sequence. This approx.
CC 2.2 kilobase covers the precursor gag region and encodes, 5' to 3',
CC p-12, p-15, p-24, a second p-15 protein, and approx. 300 extra base
CC pairs 3' to the gag region (see N60288).

SQ Sequence 9213 BP; 3297 A; 1656 C; 2217 G; 2043 T;

Initial Score = 547 Optimized Score = 547 Significance = 33.48 
Residue Identity = 98% Matches = 547 Mismatches = 10

Gaps = 0 Conservative Substitutions = 0 



X 10 20 

GGGGGACTGGAAGGGCTAATTC

||||||||||||||||||||||

CAATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTC 
8610 8620 8630 8640 8650 8660 8670 

30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 
8680 8690 8700 8710 8720 8730 8740 8750 

100 110 120 130 140 150 160 

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

||||||||||||| |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
ACTACACACCAGGACCAGGGATCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

8760 8770 8780 8790 8800 8810 8820 

170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 

|||||||||||||||||||||| |||||||||||||||||||||||||||||||||| ||||||||||||||
CAGATAAGGTAGAAGAGGCCAACAAAGGAGAGAACACCAGCTTGTTACACCCTGTGACCCTGCATGGAATGG 
8830 8840 8850 8860 8870 8880 8890 

240 250 260 270 280 290 300 310 

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 

||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ATGACCCGGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 
8900 8910 8920 8930 8940 8950 8960 

320 330 340 350 360 370 380 

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 

||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||| |||||||||
TGCATCCGGAGTACTTCAAGAACTGCTGATATCGAGCTTGCTACAAGGGACTTTCCGCTGGGGACTTTCCAG 
8970 8980 8990 9000 9010 9020 9030 

390 400 410 420 430 440 450 

GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 

||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
GGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 
9040 9050 9060 9070 9080 9090 9100 9110 

460 470 480 490 500 510 520 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT 

||||||||||||||||||||||||||||| |||||||||||||||||||||||||||| |||||||||||||
TGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAACTAGAGAACCCACTGCTT 
9120 9130 9140 9150 9160 9170 9180 

530 540 550 560 570 580 590 

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAG 

|||||||||||||||||||||||||||||||
AAGCCTCAATAAAGCTTGCCTTGAGTGCTTC 

9190 9200 9210 X 

600 

AGATCCCTC 



7. RAILEY-000-716.SEQ (1-696)

N60476 Sequence of lymphadenopathy-associated virus (LAV)

ID N60476 standard; cDNA; 9088 BP.

AC N60476;

DT 24-AUG-1991 (first entry)

DE Sequence of lymphadenopathy-associated virus (LAV) genome in lambda-

DE J19.

KW HTLV-III; human T-cell leukemia/lymphoma virus type III; ARV; AIDS;

KW associated retrovirus; HIV; ARC; probe; diagnosis; ss.



OS Lymphadenopathy-associated virus.

PN WO8601827-A.

PD 27-MAR-1986.

PF 19-SEP-1985; 007200.

PR 19-SEP-1984; GB-023659.

PA (INSP ) INST PASTEUR.

PA (CNRS ) CENT NAT RECH SCIENTIFIQU.

PI Alizon M, Barre Sinoussi F, Sonigo P, Tiollais P, Chermann JC,

PI Montagnier L, Wainhobson S;

DR WPI; 86-094080/14.

PT Cloned DNA contg. fragment hybridised with genomic RNA of LAV -

PT used for detection of lymphadenopathy-associated virus

PS Disclosure; Fig 4-11; 24pp; English.

CC The inventors claim a DNA SQ which is hybridizable with the genomic

CC RNA of the LAV viruses. Specifically claimed are SQs which code for

CC the envelope proteins, polymerase and core proteins. Also claimed

CC is a probe for the in vitro detection of LAV. N60476 was prepd.

CC from virions from FR8, an immortalized permanent LAV producing B-

CC lymphocyte line.

SQ Sequence 9088 BP; 3257 A; 1624 C; 2185 G; 2022 T;

Initial Score = 542 Optimized Score = 542 Significance = 33.17
Residue Identity = 99% Matches = 542 Mismatches = 1

Gaps = 0 Conservative Substitutions = 0 

X 10 20 

GGGGGACTGGAAGGGCTAATTC 

||||||||||||||||||||||

CAATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTC 
8500 8510 8520 8530 8540 8550 8560 

30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 
8570 8580 8590 8600 8610 8620 8630

100 110 120 130 140 150 160 

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 
8640 8650 8660 8670 8680 8690 8700 8710 

170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGACCCTGCATGGAATGG 
8720 8730 8740 8750 8760 8770 8780 

240 250 260 270 280 290 300 310 

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 
8790 8800 8810 8820 8830 8840 8850 

320 330 340 350 360 370 380 

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 
8860 8870 8880 8890 8900 8910 8920 

390 400 410 420 430 440 450 

GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 
8930 8940 8950 8960 8970 8980 8990



460 470 480 490 500 510 520 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT 
9000 9010 9020 9030 9040 9050 9060 9070 

530 540 X 550 560 570 580 590 

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTA 

|||||||||||||||||
AAGCCTCAATAAAGCTT 

9080 X 



8. RAILEY-000-716.SEQ (1-696)
Q15226 HIV-1 TAT mRNA.

ID Q15226 standard; mRNA; 1833 BP.

AC Q15226;

DT 11-MAR-1992 (first entry)

DE HIV-1 TAT mRNA.

KW Retrovirus; treatment; oligonucleotide; anti-sense; binding; ss.

OS Synthetic. 

PN WO9118004-A.

PD 28-NOV-1991.

PF 22-APR-1991; U02734.

PR 11-MAY-1990; US-521907.

PA (ISIS-) ISIS PHARM INC.

PI Ecker DJ;

DR WPI; 91-369176/50.

PT Anti-sense DNA capable of binding HIV virus TAT mRNA in human

PT cells - for treatment of retroviral disease e.g. AIDS 

PS Disclosure; Fig 1; 24pp; English. 

CC The oligonucleotides represented in Q15220-25 are capable of

CC binding at least a portion of tat mRNA of HIV. They can be used to

CC treat HIV and other human retroviruses. It is partic. effective

CC therapeutically because particular sites of the RNA of HIV or other

CC RNA are targeted e.g. the tat mRNA.

SQ Sequence 1833 BP; 525 A; 408 C; 510 G; 390 U;

Initial Score = 541 Optimized Score = 545 Significance = 33.10
Residue Identity = 732 Matches = 546 Mismatches = 29

Gaps = 1 Conservative Substitutions = 0 

X 10 20 

GGGGGACTGGAAGGGCTAATTC 

limillllllllllllllll 

CAAUGACUUACAAGGCAGCUGUAGAUCUUAGCCACUUUUUAAAAGAAAAGGGGGGACUGGAAGGGCUAAUUC 
1210 1220 1230 1240 1250 1260 1270 

30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

l!lllllll!lllllllll!MllII!!l!Il!!ill!lil!!!lllllll!!l!!lllll!llll Mill 

ACUCCCAACGAAGACAAGAUAUCCUUGAUCUGUGGAUCUACCACACACAAGGCUACUUCCCUGAUUAGCAGA 
1280 1290 1300 1310 1320 1330 1340 1350 

100 110 120 130 140 150 160 

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

mimmiiiiiiiiii iiiiiiiiiiiiiMiiiiiiiiiuiiiiiiiiiiiiiiiiiiiiiiiii 

ACUACACACCAGGGCCAGGGAUCAGAUAUCCACUGACCUUUGGAUGGUGCUACAAGCUAGUACCAGUUGAGC 
1360 1370 1380 1390 1400 1410 1420 

170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 

mi in iiiiiii him iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiinii illinium 

CAGAGAAGUUAGAAGAAGCCAACAAAGGAGAGAACACCAG 



1430 1440 1450 1460 1470 1480 1490 



240 250 260 270 280 290 300 310 

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 

IIIIIII llllllllllll 

AUGACCCGGAGAGAGAAGUGUUAGAGUGGAGGUUUGACAGCCGCCUAGCAUUUCAUCACAUGGCCCGAGAGC
1500 1510 1520 1530 1540 1550 1560 

320 330 340 350 360 370 380 

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 

llllllllllllinillllllilllllillllllllllllllllllMlllllllllllil IIIIIIIII 
UGCAUCCGGAGUACUUCAAGAACUGCUGACAUCGAGCUUGCUACAAGGGACUUUCCGCUGGGGACUUUCCAG 

1570 1580 1590 1600 1610 1620 1630 

390 400 410 420 430 440 450 

GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 

imiiiimmimi iiMiiiiiiiiiiifiiniiiii imiimiiiiiimmimi 

GGAGGCGUGGCCUGGGCGGGACUGGGGAGUGGCGAGCCCUCAGAUCCUGCAUAUAAGCAGCUGCUUUUUGCC 
1640 1650 1660 1670 1680 1690 1700 1710 

460 470 480 490 500 510 520 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 iiimiimiimiiimiiiii iiimmimi 

UGUACUGGGUCUCUCUGGUUAGACCAGAUCUGAGCCUGGGAGCUCUCUGGCUAACUAAGGAACCCACUGCUU 
1720 1730 1740 1750 1760 1770 1780 

530 540 550 560 570 X 580 590 

AAGCCTCAATAAAGCTTGCCTTGAGTGCT-TCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTA 

immmimiiiiimmmi mi i 

AAGCCUCAAUAAAGCUUGCCUUGAGUGCUGUCAAAAAAAAAAAAAAAAAA
1790 1800 1810 1820 1830 X 

600 610 620 

GAGATCCCTCAGACCCTTTTAGTCAGTG 



9. RAILEY-000-716.SEQ (1-696) 

N71016 Sequence of LAV/HTLV III envelope gene (env). 

ID N71016 standard; DNA; 4020 BP. 

AC N71016;

DT 23-APR-1991 (first entry)

DE Sequence of LAV/HTLV III envelope gene (env).

KW Glycoprotein gp 110; gp 41; AIDS vaccine; diagnosis; ss.

OS LAV/HTLV III.

FH Key Location/Qualifiers

FT CDS 487..3072

FT /*tag= a

FT /note= "A recombinant virus contg. this SQ is

FT claimed"

PN WO8702038-A.

PD 09-APR-1987.

PF 24-SEP-1986; 022987.

PR 25-SEP-1985; US-779909.

PR 27-MAR-1986; US-842984.

PR 09-SEP-1986; US-905217.

PA (ONCO-) ONCOGEN.

PA (HUSS/) HU S L.

PI Hu SL, Purchio AF, Madisen L;

DR WPI; 87-108683/15.

DR P-PSDB; P70665.

PT New recombinant viruses for directing expression of peptide(s)

PT etc. - useful in vaccines for protecting humans against AIDS

PT caused by LAV/HTLV III

PS Disclosure; Fig 2; 165pp; English.

CC Recombinant Ac-NPV carrying the chimeric LAV/HTLV III env gene was



CC used to infect Sf9 cells in tissue culture. The proteins produced on 

CC cultivation were immunoreactive with AIDS patient serum as well as

CC with monoclonal antibodies which define LAV/HTLV III envelope

CC glycoproteins gp. 110 and gp. 41. A recombinant DNA vector 

CC comprising ps-env 1, 2, 5 or 7, pv-gag1, pAc-gag1 or pAc-env 5, is

CC claimed.

SQ Sequence 4020 BP; 1352 A; 734 C; 990 G; 944 T;

Initial Score = 541 Optimized Score = 541 Significance = 33.10
Residue Identity = 99% Matches = 541 Mismatches = 4

Gaps = 0 Conservative Substitutions = 0 

X 10 20 

GGGGGACTGGAAGGGCTAATTC 

||||||||||||||||||||||

CAATGACTTACAAGGCAGCTCTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTC 
3430 3440 3450 3460 3470 3480 3490 

30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 
3500 3510 3520 3530 3540 3550 3560 

100 110 120 130 140 150 160 

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 
3570 3580 3590 3600 3610 3620 3630 3640 

170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 
3650 3660 3670 3680 3690 3700 3710 

240 250 260 270 280 290 300 310 

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 

|||||||||||||||||||||||||||||||||||||||||||||||  |||||||||||||||||||||||

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTACGATTTCATCACGTGGCCCGAGAGC 
3720 3730 3740 3750 3760 3770 3780 

320 330 340 350 360 370 380 

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGGACTTTCCAG 
3790 3800 3810 3820 3830 3840 3850 

390 400 410 420 430 440 450 

GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 

||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||

GGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 
3860 3870 3880 3890 3900 3910 3920 

460 470 480 490 500 510 520 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT 
3930 3940 3950 3960 3970 3980 3990 4000 

530 540 X 550 560 570 580 590 

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAAC 

lllllllllllllllllll 
AAGCCTCAATAAAGCTTGC 

4010 4020 



10. RAILEY-000-716.SEQ (1-696) 

N80436 Entire sequence of LAV ELI



ID N80436 standard; cDNA; 9236 BP.
AC N80436;

DT 16-DEC-1990 (first entry)
DE Entire sequence of LAV ELI

KW HIV; HTLV III; AIDS; diagnosis; vaccine; probe; hybridisation; ss.
OS Lymphadenopathy associated virus ELI.
FH Key Location/Qualifiers
FT misc_feature 1..98
FT /*tag= a
FT /label=R
FT misc_feature 99..182
FT /*tag= b
FT /label=U5
FT misc_feature 8683..9138
FT /*tag= c
FT /label=U3
FT misc_feature 9139..9236
FT /*tag= d
FT /label=R
FT CDS 336..1835
FT /*tag= e
FT /label=GAG, P80884
FT CDS 1634..4699
FT /*tag= f
FT /label=POL, P81854
FT CDS 4647..5222
FT /*tag= g
FT /label=Q, P81855
FT CDS 5165..5452
FT /*tag= h
FT /label=R, P81856
FT CDS 5436..5651
FT /*tag= i
FT /label=S, P81857
FT CDS 5830..8388
FT /*tag= j
FT /label=ENV, P81858
FT CDS 8393..9010
FT /*tag= k
FT /label=F, P81859
PN WO8707906-A.
PD 30-DEC-1987.
PF 22-JUN-1987; E00326.
PR 23-JUN-1986; EP-401380.
PA (INSP) Inst Pasteur.

PI Alizon M, Sonigo P, Wain-Hobson S, Montagnier L;
DR WPI; 88-014396/02.

DR P-PSDB; P80884, P81854, P81855, P81856, P81857, P81858, P81859.
PT New variants of lymphadenopathy associated virus (LAV) -
PT used for prodn. of DNA, antigens and antibodies used in
PT diagnosis of AIDS and pre-AIDS
PS Claim 3; Fig 7A-7J; 72pp; English.

CC LAV ELI (N80436) and LAV MAL (N80437) were isolated from the peripheral

CC blood lymphocytes of patients. The different AIDS virus isolates

CC are designated by 3 letters of the patient's name. Stable probes including

CC the DNA sequences can be used for detection of the new LAV viruses or

CC related viruses or DNA proviruses in eg biological samples. The proteins

CC or peptides can be used for detection of antibodies induced in vivo and

CC present in biological fluids. The DNA can also be used for the expression

CC of LAV viral antigens for the prodn. of a vaccine against LAV. The

CC polypeptides can also be used for the prodn. of antibodies for the

CC detection of proteins related to the LAV viruses, partic. for diagnosis



CC of AIDS or pre-AIDS.

SQ Sequence 9236 BP; 3360 A; 1642 C; 2190 G; 2044 T;



Initial Score = 502 Optimized Score = 502 Significance = 30.62 
Residue Identity = 89% Matches = 502 Mismatches = 57

Gaps = 0 Conservative Substitutions = 0 

X 10 20 

GGGGGACTGGAAGGGCTAATTC 

lllllllllllllllllllll 
CAATGACTTACAAAGAAGCTCTAGATCTCAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTT 

8630 8640 8650 8660 8670 8680 8690 

30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

iii ii iiiiiiii minimi iii iiiii minimi minimum i 

GGTCCAAAAAGAGACAAGAGATCCTTGATCTTTGGGTCTACAACACACAAGGCATCTTCCCTGATTGGCAAA 
8700 8710 8720 8730 8740 8750 8760 8770 

100 110 120 130 140 150 160 

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

mmimiiiiiimi iimimim mmiiiimiim immmiimi i 

ACTACACACCAGGGCCAGGGATCAGATATCCACTAACCTTTGGATGGTGCTACGAGCTAGTACCAGTTGATC 
8780 8790 8800 8810 8820 8830 8840 

170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 

ii i iiiiiimi i ii i imiiiii ii miimimiiii i iii imimm 

CACAGGAGGTAGAAGAAGACACTGAAGGAGAGACCAACAGCTTGTTACACCCTATATGCCAGCATGGAATGG 
8850 8860 8870 8880 8890 8900 8910

240 250 260 270 280 290 300 310 

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 

i mil mm mum i mil m inn i iiimm i m Milium 

AGGACCCGGAGAGACAAGTGTTAAAATGGAGATTTAACAGCAGACTAGCATTTGAGCACAAGGCCCGAGAGA 
8920 8930 8940 8950 8960 8970 8980 

320 330 340 350 360 370 380 

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 

1 1 1 f i i 1 1 1 1 1 1 ii m mil tiiii mini 1 1 1 m f ii 1 1 1 1 1 1 1 1 1 1 1 m i mmiii 

TGCATCCGGAGTTCTACAAAAACTGATGACACCGAGCTTTCTACAAGGGACTTTCCGCTGGGGACTTTCCAG 
8990 9000 9010 9020 9030 9040 9050 

390 400 410 420 430 440 450 

GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 

miiuiii IIIIIIII 1 1 1 f 1 1 i i 1 1 1 f I I I i 1 1 1 1 1 1 1 1 ! t f 1 1 1 ! I f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
GGAGGCGTGGACTGGGCGGGACTGGGGAGTGGCTAACCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 
9060 9070 9080 9090 9100 9110 9120 9130 

460 470 480 490 500 510 520 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT 

iniiuiiiiiiiuiiiiiiiimmmmmmmmuiu miimiummi 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAGCTAGGGAACCCACTGCTT 
9140 9150 9160 9170 9180 9190 9200 

530 540 550 560 570 580 590 

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAG 

i 1 1 M 1 1 1 1 1 1 1 f 1 1 1 1 M 1 1 1 it 1 1 1 1 1 1 II 1 

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAA 
9210 9220 9230 X 



600 

AGATCCCTCAG 



IntelliGenetics

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file railey-000-716.res made by shears on Mon 26 Apr 93 15:30:46-PDT.



Query sequence being compared: RAILEY-000-716.SEQ (1-696)
Number of sequences searched: 128494
Number of scores above cutoff: 4938



Results of the initial comparison of RAILEY-000-716.SEQ (1-696) with:



Data bank : EMBL-NEW 2, all entries
Data bank : GenBank 75, all entries
Data bank : GenBank-NEW 2, all entries
Data bank : UEMBL 33_75, all entries



[Histogram of initial scores: NUMBER OF SEQUENCES (logarithmic axis, 0 to 100000) versus SCORE (0, 76, 151, 227, 303, 378, 454, 530, 605, 681) and STDEV]



PARAMETERS

Similarity matrix          Unitary     K-tuple                4
Mismatch penalty           1           Joining penalty       30
Gap penalty                1.00        Window size           32
Gap size penalty           0.33
Cutoff score               0
Randomization group        0

Initial scores to save     40          Alignments to save    10
Optimized scores to save   0           Display context       50
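The K-tuple parameter above sets the word length used for the initial comparison. The following is a minimal sketch of the general k-tuple lookup idea, not FastDB's actual implementation; the function names `ktuple_index` and `shared_diagonals` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def ktuple_index(seq: str, k: int = 4) -> dict:
    """Map every k-letter word in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return dict(index)


def shared_diagonals(query: str, target: str, k: int = 4) -> dict:
    """Count k-tuple hits per diagonal (target position minus query position).

    A diagonal with many hits marks a region where the two sequences
    share a run of identical words. This is an illustrative assumption
    about how a k-tuple search locates candidate regions.
    """
    index = ktuple_index(target, k)
    hits = defaultdict(int)
    for q in range(len(query) - k + 1):
        for t in index.get(query[q:q + k], ()):
            hits[t - q] += 1
    return dict(hits)


if __name__ == "__main__":
    # Toy example: the target is the query shifted right by two bases.
    print(shared_diagonals("TGGAAGGGCT", "GGTGGAAGGGCT"))
```

A larger k makes the lookup faster and more selective; a smaller k finds weaker similarities at the cost of more spurious hits.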






SEARCH STATISTICS

Scores:     Mean    Median    Standard Deviation
            30      30        12.19

Times:      CPU            Total Elapsed
            00:44:53.05    01:01:44.00

Number of residues:              154807074
Number of sequences searched:    128494
Number of scores above cutoff:   4938





Cut-off raised to 24. 

Cut-off raised to 28. 

Cut-off raised to 31. 

Cut-off raised to 34. 

Cut-off raised to 37. 

Cut-off raised to 40. 

Cut-off raised to 43. 

Cut-off raised to 46. 

Cut-off raised to 48. 

Cut-off raised to 51. 

Cut-off raised to 53. 

Cut-off raised to 56. 

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 

A 100% identical sequence to the query sequence was not found. 
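The significance values in this report are consistent with the initial score expressed in standard deviations above the mean score. The sketch below assumes that definition; the function name `significance` is hypothetical, and the mean is used as printed (rounded to 30), so results may differ from the report in the last digit.

```python
def significance(score: float, mean: float, stdev: float) -> float:
    """Number of standard deviations by which a score exceeds the mean."""
    if stdev <= 0:
        raise ValueError("standard deviation must be positive")
    return (score - mean) / stdev


if __name__ == "__main__":
    # Mean and standard deviation as printed under SEARCH STATISTICS.
    MEAN, STDEV = 30.0, 12.19
    for name, init_score in [("HIVPV22", 681), ("HIVHXB2CG", 664)]:
        print(name, round(significance(init_score, MEAN, STDEV), 2))
```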



The list of best scores is: 



Sequence Name  Description                     Length  Init. Score  Opt. Score  Sig.    Frame

               **** 53 standard deviations above mean ****
 1. HIVPV22    Human immunodeficiency virus      9770      681         684      53.41     0
               **** 52 standard deviations above mean ****
 2. HIVHXB2CG  Human immunodeficiency virus      9718      664         671      52.01     0
 3. REHTLV3    Human T-cell leukaemia type I     9748      664         671      52.01     0
 4. HIVH3CG    Human T-cell lymphotropic vir     9749      664         671      52.01     0

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1. RAILEY-000-716.SE9 (1-696) 

HIVPV22 Human immunodeficiency virus type 1, isolate PV22,

LOCUS HIVPV22 9770 bp ss-RNA VRL 15-MAR-1990
DEFINITION Human immunodeficiency virus type 1, isolate PV22, complete genome
(H9/HTLV-III proviral DNA).
ACCESSION K02083 
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polyprotein; proviral gene; rev protein; reverse transcriptase; 
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This sequence for a H9/HTLV-III virus uas deterfiined froR one 
complete proviral clone [1]. Additionally, several cDNA clones of
the viral RNA were sequenced for comparison with the entire
proviral sequence. The differences between cDNA and proviral DNA
are extensive and are listed in the Sites Table as variations. The
authors believe that the variations may be due in part to different
strains in the H9/HTLV-III cell line, because it was established by
infection with material from several AIDS patients.

With the addition of g at 2111, gag cds and pol cds are very close 
to those of HXB2, BRU, and related HIV viruses. 
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EHSLPGRUKPKHIGGIGGFIKVReYDaiLIEICGHKAIGTVLVGPTPVNIIGRNLLTS 
IGCTLNFPISPIETVPVKLKPGNDGPKVK8HPLTEEKIKALVEICTEMEKEGKISKIG 
PENPYNTPVFAIKKKDSTKWRKLVDFRELNKRT8DFHEV8LGIPHPAGLKKKKSVTVL 
DVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRY8YNVLP8GWKGSPAIF8SSMTKI 
LEPFRK8NPDIVIY8YHDDLYVGSDLE1G8HRTKIEELR8HLLRWGLTTPDKKH8KEP 
PFLWHGYELHPDKWTV@PIVLPEKDSHTVNDI8KLVGKLNWAS8IYPGIKyR8LCKLL 
RGTKALTEMIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEI8K8G8G8MTY8I 
YQEPFKNLKTGKYARHRGAHTNDVK8LTEAV8KITTESIVIHGKTPKFKLPI8KETWE 
TyfciTEYyQATBIPFyEFVNTPPLVKL yYM FKFPIVrcAFTFYVnnAANRFTRLGKAGY 



LTNKGR6KVVPLTNTTN8KTEL8AIYLAL8DSGLEVNIVTDS8YALGII8A6PD8SES 
ELVN9IIE8LIKK8KVYLAHVPAHKGIGGNE8VDKLVSAGIRKILFLDGIDKA8DEHE 
K YHSNURAH ASDFNLPP WAKE I VASCDKC9LKGEAHHGQVDCSPG I W8LDCTHLEGK 
VILVAVHVASGYIEAEVIPAETGSETAYFLLKLAGRHPVKTIHTDNGSNFTSATVKAA 
CHHAGIKfiEFGIPYNPeS8GVVESMNKELKKIIG8VRD9AEHLKTAV9MAVFIHNFKR 
KGGIGGYSAGERI VDI I ATDIQTKEL9K0ITK1SNFRVYYRDSRNPLMKGPAKLLWKG 
EGAVV I QDNSDI K VVPRRKAK I I RDYGK8H AGDDCVASRQDED ■ 
CDS 5086..5664
/note="vif protein"
/codon_start=1
/translation="HENRH9VMIVH8VDRMRIRTHKSLVKHHHYVSGKARGHFYRHHY
ESPHPRISSEVHIPLGDARLVITTYBGLHTGERDHHLG8GVSIEHRKKRYST8VDPEL
ADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEY8AGHNKVGSLBYLALAALITPKKIK
PPLPSVTKLTEDRHMKPSKTKGHRGSHTRNGH"
CDS 5604..5840
/note="vpr protein"
/codon_start=1
/translation="HE8APEDQGPQREPHNEWTLELLEELKNEAVRHFPRIHLHGLG8
HIYETYGDTWAGVEAIIRIL88LLFIHF8NWVST"
CDS 6107..6352
/note="vpu protein"
/codon_start=1
/translation="H8PI8IAIVALVVAIIIAILVWSIVIIEYRKILR8RKIDRLIDR
LIERAEDSGNESEGEISALVEMGVEMGHHAPHDVDDL"
CDS 6267..8837
/note="envelope polyprotein"
/codon_start=1
/translation="MRVKEKYQHLHRUGHRHGTiilLLGMLMICSATEKLUVTVYYGVPV
WKEATTTLFCASDAKAYDTEVHNVHATHACVPTDPNPBEVVLVNVTENFNMMKNDMVE 
QMHEDI ISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRNIHEKGEIKNCSFN 
ISTSIRGKVQKEYAFFYKLDI IPIDNDTTSYTLTSCNTSVIT8ACPKVSFEPIPIHYC 
APAGFA I LKCNNKTFNGTGPCTNVSTV8CTHGI RPVVST6LLLNGSL AEEEVVI RSAN 
FTDNAKT1 I V8LN8SVE I NCTRPNNNTRKS I RI 9RGPGRAFVT IGK I GNHR8AHCNIS 
RAKHNNTLKQIDSKLREQFGNNKTIIFK8SSGGDPEIVTHSFNCGGEFFYCNST8LFN 
STWFNSTHSTEGSNNTEGSDTITLPCRIKQFINMH9EVGKAMYAPPISGQIRCSSNIT 
GLLLTRDGGNNNNESEIFRPGGGDHRDNHRSELYKYKVVKIEPLGVAPTKAKRRVV8R 
EKRAVGIGALFLGFLGAAGSTNGAASNTLTV8AR8LLSGIV888NNLLRAIEA88HLL 
8LTVHGIK8L8ARILAVERYLKD88LLGIHGCSGKLICTTAVPHNASUSNKSLE8IHN 
NMTHREHDREINNYTSL I HSL I EES8N88EKNE8ELLELDKHANLWNHLN I TNULUY I 
KLFIHIVGGLVGLRIVFAVLSIVNRVR8GYSPLSF8THLPTPRGPDRPEGIEEEDGER 
DRDRSIRLVNGSLALIMDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGHEALKYHHN 
LL8YyS8ELKNSAVSLLNATAIAVAEGTDRVIEVV8GAYRAIRHIPRRIR8GLERILL 

"
CDS 8839..945?
/note="nef protein"
/codon_start=1
/translation="HGGKWSKSSVIGHPAVRERHRRAEPAADGVGAASRDLEKHGAIT
SSNT AANNAACAWLEA8EEEKVGFPVTPQVPLRPHTYKAAVDLSHFLKEKGGLEGL I H 
SQRRQDILDLWIYHTSGYFPDyQNYTPGPGIRYPLTFGHCYKLVPVEPDKVEEANKGE 
NTSLLHPVSLHGMDDPEREVLEyRFDSRLAFHHVARELHPEYFKMC"

BASE COUNT 3436 a 1786 c 2376 g 2172 t 

ORIGIN 482 bp upstream of BglII site.

Initial Score = 681 Optimized Score = 684 Significance = 53.41
Residue Identity = 98% Matches = 685 Mismatches = 9
Gaps = 2 Conservative Substitutions = 0
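The Matches, Mismatches and Gaps figures above can be tallied from any gapped pairwise alignment. The sketch below is illustrative only: the function name `alignment_stats` is hypothetical, `-` is assumed to mark a gap position, and a run of consecutive gap positions is counted as one gap, which is an assumption rather than FastDB's documented rule. The report does not state how Residue Identity is derived from these counts, so no percentage is computed here.

```python
def alignment_stats(a: str, b: str, gap: str = "-") -> dict:
    """Tally matches, mismatches and gap runs for two aligned, equal-length strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = mismatches = gaps = 0
    in_gap = False
    for x, y in zip(a, b):
        if x == gap or y == gap:
            if not in_gap:  # count each run of gap columns once
                gaps += 1
            in_gap = True
        else:
            in_gap = False
            if x == y:
                matches += 1
            else:
                mismatches += 1
    return {"matches": matches, "mismatches": mismatches, "gaps": gaps}


if __name__ == "__main__":
    # Toy alignment, not taken from the report.
    print(alignment_stats("ACTGGAAGGGC", "AC--GAAGAGC"))
```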

X 10 20 30 40 50 60 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACAC 

! II! IIIMI!!l!llll!ill!IIII!lilllllllllll!ll!lllll!lllllll!fll||| 

TGTAGTGGG-- TGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACAC 
X 10 20 30 40 50 60 70 



70 80 90 100 110 120 130 140 

ACAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATG 
tf 11 II II MM Illl 1111 il IN Nil II Ml! ill! II MM M M M II M M ! M M M M II M 



ACAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGGACCAGGGATCAGATATCCACTGACCTTTGGATG 
80 90 100 110 120 130 140 



150 160 170 180 190 200 210 

GTGCTACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTT 

IIIIIillllMIIIIIIIIIIIIIIIfllMIIIIIIIIIIIIIIII lllllllllllllllllllllll 
GTGCTACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAACAAAGGAGAGAACACCAGCTTGTT 

150 160 170 180 190 200 210 

220 230 240 250 260 270 280 

ACACCCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCT 

IIIIIIIIIIIIIMfllllllllllllllNI illlllllilllllllllllllllllllllllllllll 

ACACCCTGTGAGCCTGCATGGAATGGATGACCCGGACAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCT 
220 230 240 250 260 270 280 

290 300 310 320 330 340 350 

AGCATTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAA 

lllllllll!llll!llll!lllllll!llllllll!ll!!!lllllll!lllll 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 

AGCATTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGATATCGAGCTTGCTACAA 
290 300 310 320 330 340 350 

360 370 380 390 400 410 420 

GGGACTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATG 

minimum! imiiiimmmmmimi immiimiimimiiiii 

GGGACTTTCCGCTGGGGACTTTCCAGGGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATG 
360 370 380 390 400 410 420 430 

430 440 450 460 470 480 490 500 

CTGCATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCT 

iiiiiifiiiiiiiiiiiiiiiiiiiiiimmmmiimiiiiimi iimmimim 

CTGCATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCT 
440 450 460 470 480 490 500 . 

510 520 530 540 550 560 570 

CTGGCTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCC 

IIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIlllMlllllilllllllllllllllll 

CTGGCTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCC 
510 520 530 540 550 560 570 

580 590 600 610 620 630 640 

GTCTGTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTG 

IIIIMMIIIMillllllllllllllllllllllllllllllllllllllllllllllllllllllllll 

GTCTGTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTG 
580 590 600 610 620 630 640 

650 660 670 680 690 X 

GCGCCCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 

lllllllllllllllllllllllllllillllllllllimilllllim 
GCGCCCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGACGCAGGACTCGGCTTGCTGA 

650 660 670 680 690 700 710 

AGCGCGCACGGCAAGAGGCGAGGGGCGGCG 
720 730 740 



2. RAILEY-000-716.SEQ (1-696)

HIVHXB2CG Human immunodeficiency virus type 1 (HXB2), comple

LOCUS HIVHXB2CG 9718 bp ss-RNA VRL 14-JAN-1992
DEFINITION Human immunodeficiency virus type 1 (HXB2), complete genome;
HIV1/HTLV-III/LAV reference genome.
ACCESSION K03455
KEYWORDS TAR protein; acquired immune deficiency syndrome; complete genome;
env protein; gag protein; long terminal repeat (LTR); pol protein;
polyprotein; proviral gene; reverse transcriptase; trans-activator.
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narkedlu reduced cutopathic effects 



JOURNAL Science 233, 655-659 (1986)
STANDARD full automatic
REFERENCE 26 (sites)
AUTHORS Wright,C.M., Felber,B.K., Paskalis,H. and Pavlakis,G.N.
TITLE Expression and characterization of the trans-activator of
HTLV-III/LAV virus
JOURNAL Science 234, 988-992 (1986)
STANDARD full automatic
REFERENCE 27 (bases 5611 to 5611)
AUTHORS Ratner,L.
JOURNAL Unpublished (1987) Washington U Med School, St. Louis, MO
STANDARD full automatic
REFERENCE 28 (sites)
AUTHORS Wong-Staal,F., Chanda,P.K. and Ghrayeb,J.
TITLE Human immunodeficiency virus: the eighth gene
JOURNAL AIDS Res. Hum. Retroviruses 3, 33-39 (1987)
STANDARD full automatic
REFERENCE 29 (sites)
AUTHORS Patarca,R., Heath,C., Goldenberg,G.J., Rosen,C.A., Sodroski,J.G.,
Haseltine,W.A. and Hansen,U.M.
TITLE Transcription directed by the HIV long terminal repeat in vitro
JOURNAL AIDS Res. Hum. Retroviruses 3, 41-55 (1987)
STANDARD full automatic
REFERENCE 30 (bases 1 to 9635; 1 to 9635)
AUTHORS Ratner,L., Fisher,A., Jagodzinski,L.L., Mitsuya,H., Liou,R.-S.,
Gallo,R.C. and Wong-Staal,F.
TITLE Complete nucleotide sequences of functional clones of the AIDS
virus
JOURNAL AIDS Res. Hum. Retroviruses 3, 57-69 (1987)
STANDARD full automatic
REFERENCE 31 (sites)
AUTHORS Muesing,M.A., Smith,D.H. and Capon,D.J.
TITLE Regulation of mRNA accumulation by a human immunodeficiency virus
trans-activator protein
JOURNAL Cell 48, 691-701 (1987)
STANDARD full automatic
REFERENCE 32 (sites)
AUTHORS Modrow,S., Hahn,B.H., Shaw,G.M., Gallo,R.C., Wong-Staal,F. and
Wolf,H.
TITLE Computer-assisted analysis of envelope protein sequences of seven
human immunodeficiency virus isolates: Prediction of antigenic
epitopes in conserved and variable regions
JOURNAL J. Virol. 61, 570-578 (1987)
STANDARD full automatic
REFERENCE 33 (sites)
AUTHORS Goh,W.C., Sodroski,J.G., Rosen,C.A. and Haseltine,W.A.
TITLE Expression of the art gene protein of human T-lymphotropic virus
type III (HTLV-III/LAV) in bacteria
JOURNAL J. Virol. 61, 633-637 (1987)
STANDARD full automatic
REFERENCE 34 (sites)
AUTHORS Nabel,G. and Baltimore,D.
TITLE An inducible transcription factor activates expression of human
immunodeficiency virus in T cells
JOURNAL Nature 326, 711-713 (1987)
STANDARD full automatic
REFERENCE 35 (sites)
AUTHORS Fisher,A.G., Ensoli,B., Ivanoff,L., Chamberlain,M., Petteway,S.,
Ratner,L., Gallo,R.C. and Wong-Staal,F.
TITLE The sor gene of hiv-1 is required for efficient virus transmission
in vitro
JOURNAL Science 237, 888-893 (1987)
STANDARD full automatic
REFERENCE 36 (sites)
AUTHORS Ido,E., Han,H.-p., Kezdy,F.J. and Tang,J.
TITLE Kinetic studies of human immunodeficiency virus type 1 protease and
its active-site hydrogen bond mutant A28S
JOURNAL J. Biol. Chem. 266, 24359-24366 (1991)
STANDARD full automatic
COMMENT

[6] sites; tat mRNA and other transcript boundaries.
[7] sites; tat mRNA.
[8] sites; mRNA splice sites.
[9] sites; 27K antigen cds.
[5] sites; gp160 and gp120 coding sequences.
[1] sites; regulatory sequences in the LTR.
[(in) Weiss,R., Teich,N., Varmus,H. and Coffin,J. (Eds.); RNA Tumor
Viruses, Second] review; bases 1 to 9718.
[15] sites; trans-activator function and TAR sequence.
[19] sites; pol coding sequence.
[22] sites; 23K sor gene product.
[23] sites; pol NH2-terminal region.
[20] sites; sor 23K protein.
[21] sites; sor 23K protein.
[24] sites; Sp1 binding sites in the promoter region.
[17] sites; acceptor and donor splice sites for tat and 27K.
[10] sites; deletion mutants in the tat gene.
[18] sites; env gene conserved/variable regions; separate entries.
[16] sites; trs cds boundaries.
[12] sites; trs cds boundaries.
[11] sites; env gene conserved/variable regions; separate entries.
[26] sites; tar or transactivator target.
[13] sites; 3' orf mutations.
[14] sites; pol p34 terminus.
[31] sites; promoter, TAR, tat-III mutants.
[32] sites; envelope protein epitopes.
[33] sites; trs/art protein.
[34] sites; inducible enhancer element.
[27] revises [30].
[29] sites; long terminal repeat.
[28] sites; R orf.
[35] sites; sor.

Sequence for [25] kindly provided in computer-readable form by
L.Ratner, 19-AUG-1986.

The HXB2 sequence is being used as a reference genome for all the
HIV entries because it has been derived from a demonstrably
infectious clone. Hence not all of the 'sites' references above
were concerned with this isolate.

FEATURES Location/Qualifiers
exon <5830..6044
/number=2
/note="tat protein, (first expressed exon)"
exon <5969..6044
/number=2
/note="trs protein, (first expressed exon)"
LTR 1..634
/note="5' LTR"
repeat_region 454..551
/note="R repeat 5' copy"
mRNA 455..9635
/note="HXB2 genome mRNA"
prim_transcript 455..9635
/note="tat, trs, 27K subgenomic mRNA"
intron 743..5776
/note="tat,trs, 27K mRNA intron 1"
intron 6045..8377
/note="tat intron 1"
intron 6045..8377
/note="trs intron 2"
intron 6045..8377
/note="27K mRNA intron 2"
intron 6045..8377
/note="tat, trs intron 2"
exon 8378..>8423
/number=3
/note="tat protein"
exon 8378..>8652
/number=3
/note="trs protein"
LTR 9085..9718
/note="3' LTR"
repeat_region 9539..9635
/note="R repeat 3' copy"
polyA_signal 9611..9616
/note="HXB2 mRNA polyadenylation signal"
CDS join(5830..6044,8378..8423)
/note="tat protein"
/codon_start=1
/translation="HEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALG
ISyGRKKRRQRRRAHQNSSTHQASLSKQPTSQSRGDPTGPKE"
CDS join(5969..6044,8378..8652)
/note="trs protein"
/codon_start=1
/t.ran*iTflt.ir.n= u «Af:Rt;nn<;nPPI TRTURI TKM VftqMPPPNPFfcTRAARRNRRRR
WRER9RQIHSISERILGTYLGRSAEPVPL8LPPLERLTLDCNEDCGTSGTQGVGSPQI
LVESPTVLESGTKE"
CDS 78?..2291
/note="gag polyprotein"
/codon_start=1
/translation="I1GARASVLSGGELDRMEKIRLRPGGKKKYKLKHIVWASRELERF
AVNPGLLETSEGCR9ILG9LQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALD
KIEEE6NKSKKKASQAAADTGHSNQVSQNYPIVQNIQGQHVHQAISPRTLNAWVKVVE
EKAFSPEVIPHFSALSEGATPQDLNTHLNTVGGHQAAH9MLKETINEEAAEHDRVHPV
HAGPIAPGQHREPRGSDIAGTTSTLQEfllGUIITNNPPIPVGEIYKRWIILGLNKIVRM
YSPTS1LDIRSGPKEPFRDYVDRFYKTLRAEQAS9EVKNWMTETLLVQNANPDCKTIL
KALGPAATLEEHHTAC0GVGGPGHKARVLAEAMSQVTNSATII1MQRGNFRNSRKIVKC
FNCGKEGHTARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIHPSYKGRPGNFLfi
SRPEPTAPPEESFRSGVETTTPP8K8EPIDKELYPLTSLRSLFGNDPSS8"
CDS 2357..5095
/partial
/note="pol polyprotein; (NH2-terminus uncertain)"
/codon_start=1
/translation="MSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGP
TPVN 1 1 GRMLLTQI GCTLNFP I SP I ETVPVKLKPGMDGPKVK6WPLTEEK I KALVEI C 
TEHEKEGK I SK I GPENPYNTP VFA IKKKDSTKNRKLVDFRELNKRTQDFWEVSLGIPH 
PAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPflGWK 
GSPAIFSSSMTKILEPFRK9NPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRSHLLRM 
GLTTPDKKHQKEPPFLWMGYELHPDKyTVQPIVLPEKDSWTVNDIQKLVGKLNWASQI 
YPGIKVR9LCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAE 
IQKGGQGSyTYQIYQEPFKNLKTGKYARflRGAHTNDVKQLTEAVSKITTESIVIHGKT 
PKFKLPI9KETyETN«TEYHSATWIPE«EFVNTPPLVKLHYQLEKEPIVGAETFYVDG 
AANRETKLGKAGYVTNRGRQKVVTLTDTTNGKTEL9AIYLALQDSGLEVNIVTDSQYA 
LG1I9A8PD8SESELVN9IIEQLIKKEKVYLAHVPAHKGIGGNE9VDKLVSAGIRKVL 
FLDG I DKA8DEHEKYHSNWRAMASDFNLPPVVAKEI VASCDKC9LKGEAMHG9VDCSP 
Giy9LDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRHPVKTIHTD 
MGSNFTGATVRAACHMAGIK9EFGIPYNP9S9GVVESHNKELKKIIG9VRDQAEHLKT 
AV9HAVF I HNFKRKGG I GGYSAGER IVDIIATDI 9TKEL9K9 I TK I 9NFRV Y YRDSRN 
SLWKGPAKLLHKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED"
CDS 5040..5618
/note="sor 23K protein"
/codon_start=1
/translation="HENRHflVHIVHBVDRNRIRTHKSLVKHHHYVSGKARGllFYRHHY
ESPHPRISSEVHIPLGDARLVITTYyGLHTGERDHHLG9GVSIEWRKKRYST9VDPEL
AD9LIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSL8YLALAALITPKKIK
PPLPSVTKLTEDRHNKP8KTKGHRGSHTHNGH"
CDS 5553..5794
/note="R, (ORF) protein"
/codon_start=1
/translation="f1E9APED9GPQREPHNEyTLELLEELKNEAVRHFPRIHLHGLG8
HIYETYGDTWAGVEAIIRIL89LLFIHFGNWVST"
CDS 6224..8794
/note="envelope polyprotein"
/codon_start=1
/translation="HRVKEKY9HLMRyGHRyGT«LLG[1LMICSATEKLHVTVYYGVPV
WKEATTTLFCASDAKAYDTEVHNVHATHACVPTDPNP8EVVLVNVTENFDMWKNDRVE 
9HHEDIISLMD9SLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRHIHEKGEIKNCSFN 
ISTSIRGKV9KEYAFFYKLDIIPIDNDTTSYSLTSCNTSVIT8ACPKVSFEPIPIHYC 
APAGFAILKCNNKTFNGTGPCTNVSTVGCTHGIRPVVST9LLLNGSLAEEEVVIRSVN 
FTDNAKTIIV9LNTSVEINCTRPNNNTRKRIRI8RGPGRAFVTIGKIGNMR9AHCNIS 
RAKWNNTLKBIDSKLRE8FGNNKTIIFKGSSGGDPEIVTHSFNCGGEFFYCNSTQLFN 
STWFNSTySTEGSNNTEGSDTITLPCRIKQI INMHQKVGKAMYAPPISGQIRCSSNIT 
GLLLTRDGGNSNNESEIFRLGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVVGR 
EKRAVGIGALFLGFLGAAGSTHGAASHTLTV9AR9LLSGIVQ8QNNLLRAIEA8SHLL 
9LTVWGIK9L8ARILAVERYLKD0GLLGiyGCSGKLICTTAVPHNASWSNKSLE6IWN 
HTTHHEHDREINNYTSLIHSLIEESQNflffiKNEflELLELDKHASLHNyFHITNHLHYI 
KLFINI VGGLVGLR I VF AVLS I VNR VRQGYSPLSF9THLP I PRGPDRPEG I EEEGGER 
DRDRS I RL VNGSLAL I HDDLRSLCLFSYHRLRDLLL I VTR I VELLGRRGWE ALK YWWN 
LL9YtlS9ELKNSAVSLLNATAIAVAEGTDRVIEVV8GACRAIRHIPRRIR9GLERILL 

CDS 5796. -9167 



/note=°27K protein! "^^!R?B^e ^termination) 8 
/codon_start=l 

/tran5lation= H flGGKHSKSSVIGWLTVRERI1RRAEPAADGVGAASRDLEKHGAIT 
SSNTAATNAACAHLEAQEEEEVGFPVTP6VPLRPHTYKAAVDLSHFLKEKGGLEGLIH 
SSRRQDILDLWIYHTQGYFPD n 

BASE COUNT 3411 a 1773 c 2370 g 2164 t 

ORIGIN 435 bp upstream of PvuII site; 5' end of proviral genome.

Initial Score = 664 Optimized Score = 671 Significance = 52.01
Residue Identity = 97% Matches = 673 Mismatches = 13
Gaps = 3 Conservative Substitutions = 0

10 20 30 40 50 60 70 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

lllilllillllllllllllllllllllllllllllllllllllillllllllilllilllllll 
TGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

X 10 20 30 40 50 60 

80 90 100 110 120 130 140 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 

iiiiiiiiiiiniii imimiiiiimiiiiiiiii iiiiiiiiiiiiiiiiiiiiiiimiii 

GGCTACTTCCCTGATTAGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGC 
70 80 90 100 110 120 130 

150 160 170 180 190 200 210 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 

imillllliimilllilMil! Ill llillil Mill i 1 1 i M i ! 1 1 1 1 ! 1 1 1 1 M 1 1 1 1 1 1 1 1 

TACAAGCTAGTACCAGTTGAGCCAGAGAAGTTAGAAGAAGCCAACAAAGGAGAGAACACCAGCTTGTTACAC 
140 150 160 170 180 190 200 

220 230 240 250 260 270 280 

CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 

llllllllllllllllllllllllliill llllllllllllllllllllllllllllllllllllllllll 

CCTGTGAGCCTGCATGGAATGGATGACCCGGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 
210 220 230 240 250 260 270 280 

290 300 310 320 330 340 350 360 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 

lllllllll llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 
TTTCATCACATGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 
290 300 310 320 330 340 350 

370 380 390 400 410 420 430 

CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC 

iiiiiniiiii miimiiiiiiiinmiiiiiii iiiiiiiiiiiiiiiiniiiiiii mi 

CTTTCCGCTGGGGACTTTCCAGGGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATCCTGC 
360 370 380 390 400 410 420 

440 450 460 470 480 490 500 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG 

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiniiiiii nmiimuiimiii 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG 
430 440 450 460 470 480 490 

510 520 530 540 550 560 570 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

lllllllllllllllllillllllllllilllllllllllllMllllllllllllilllllllllllllll 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 
500 510 520 530 540 550 560 

580 590 600 610 620 630 640 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 

iiiiMiiiiiniiiiiiiiiiiiiiiiiiiiiiiiiiiiniiiiiiiiiiiiiiiiiiiiiiiiiiiii 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 
570 580 590 600 610 620 630 640 



650 660 670 680 690 X 

CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 

minium iiiiiiiiiiiiiiiniii iiiniiiiiii 

CCGAACAGGGACCTGAAAGCGAAAGGGAAACCA — GAGCTCTCTCGACGCAGGACTCGGCTTGCTGAAGCG 
650 660 670 680 X 690 700 710 

CCCGCACGGCAAGAGGCGAGGGGCGG 
720 730 



3. RAILEY-000-716.SEQ (1-696)

REHTLV3 Human T-cell leukaemia type III (HTLV-III) provira

LOCUS REHTLV3 9748 bp RNA VRL 08-MAY-1992
DEFINITION Human T-cell leukaemia type III (HTLV-III) proviral genome (AIDS
virus for acquired immune deficiency syndrome)
ACCESSION X01762
KEYWORDS acquired immune deficiency syndrome; direct repeat; endonuclease;
glycoprotein; inverted repeat; protease; provirus;
reverse transcriptase; terminal repeat.
SOURCE Human immunodeficiency virus type 1
ORGANISM Human immunodeficiency virus type 1
Viridae; ss-RNA enveloped viruses; Positive strand RNA viruses;
Retroviridae; Lentivirinae.
REFERENCE 1 (bases 1 to 9748)
AUTHORS Wong-Staal,F., Gallo,R.C., Chang,N.T., Ghrayeb,J., Papas,T.S.,
Lautenberger,J.A., Pearson,M.L., Petteway,S.R. Jr., Ivanoff,L.,
Baumeister,K., Whitehorn,E.A., Rafalski,J.A., Doran,E.R.,
Josephs,S.J., Starcich,B., Livak,K.J., Patarca,R., Haseltine,W. and
Ratner,L.
TITLE Complete nucleotide sequence of the AIDS virus, HTLV-III
JOURNAL Nature 313, 277-284 (1985)
STANDARD full automatic
REFERENCE 2 (bases 1 to 9748)
AUTHORS Muesing,M.A., Smith,D.H., Cabradilla,C.D., Benton,C.V., Lasky,L.A.
and Capon,D.J.
TITLE Nucleic acid structure and expression of the human AIDS/
lymphadenopathy retrovirus
JOURNAL Nature 313, 450-458 (1985)
STANDARD full automatic
FEATURES Location/Qualifiers
misc_feature 1..634
/note="long terminal repeat"
repeat_unit 1..2
/note="inverted repeat"
promoter 427..430
/note="TATA-box"
misc_feature 453
/note="U3 region"
misc_feature 454..551
/note="R region"
misc_RNA 454
/note="cap site"
misc_feature 552..634
/note="U5 region"
repeat_unit 633..634
/note="inverted repeat"
misc_feature 635..653
/note="tRNA binding site (tRNA-Lys)"
repeat_region 1968..2002
/note="direct repeat"
repeat_region 2031..2065
/note="direct repeat"
repeat_region 2128..2163
/note="direct repeat"
repeat_region ?1A4..?t7A
/note="direct repeat"
misc_feature 7786..7787
/note="put. peptide cleavage site"
misc_feature 9098..9103
/note="poly purine stretch"
repeat_region 9115..9748
/note="long terminal repeat"
misc_feature 9115..9567
/note="U3 region"
misc_feature 9568..9665
/note="R region"
misc_feature 9641..9646
/note="polyadenylation signal"
misc_feature 9666..9748
/note="U5 region"
repeat_unit 9747..9748
/note="inverted repeat"
CDS 787..2321
/note="gag precursor polypeptide"
CDS 1183..2321
/note="gag p24 and gag p15 for major capsid protein and
for put. retroviral nucleic acid binding protein
(NBP)(ref.2) (boundaries not defined)"
CDS 787..1182
/note="gag p17"
/codon_start=1
/translation="HGARASVLSGGELDRyEKIRLRPGGKKKYKLKHIVWASRELERF
AVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALD 
KIEEE8NKSKKKA8QAAADTGHSS9VS8NY 0 
CDS 2081..5125
/note="pol precursor polypeptides put. protease at 5'
terminus reverse transcriptase put. endonuclease at 3'
terminus"
/codon_start=1
/translation="FFREDLAFLQGKAREFSSEQTRANSPTISSE8TRANSPTRRELQ
VMGRDNNSPSE AGADRQGT VSFNFP8 I TLW8RPLVT I K I GG8LKEALLDTGADDTVLE 
EHSLPGRyKPKHIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGPTPVNIIGRNLLTfl 
IGCTLNFPISPIETVPVKLKPGMDGPKVKQHPLTEEKIKALVEICTEMEKEGKISKIG 
PENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFHEVSLGIPHPAGLKKKKSVTVL 
DVGDAYFSVPLDEDFRKYTAFTIPSINMETPGIRYQYNVLP9GHKGSPAIFSSSHTKI 
LEPFKKQNPD I V I Y9 YHDDL YVGSDLE I G8HRTK I EELR9HLLRWGLTTPDKKH9KEP 
PFLyHGYELHPDKWTVePIVLPEKDSWTVNDiaKLVGKLNHASQIYPGIKVRSLCKLL 
RGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIfiKSG9G6MTY9I 
YQEPFKNLKTGKYARhRGAHTNDVKQLTEAVQKITTESIVIHGKTPKFKLPIQKETHE 
TUUTEVUflATIIIPEHEFVNTPPLVKLUY&LEKEPIVGAETFYVDGAANRETKLGKAGY 
VTNKGRQKyVPLTNTTN9KTELQAIYLALSDSGLEVNIVTDS9YALGII9A9PDKSES 
ELVN8IIE9LIKKEKVYLAWVPAHKGIGGNE9VDKLVSAGIRKILFLDGIDKA9DEHE 
KYHSNWRAHASDFNLPPVVAKEIVASCDKC8LKGEAHHG8VDCSPGIH9LDCTHLEGK 
VILVAVHVASGYIEAEVIPAETG8ETAYFLLKLAGRWPVKTIHTDNGSNFTSATVKAA 
CWHAGIK8EFGIPYNP9S8GVVESHNKELKKIIG9VRD9AEHLKTAV8MAVFIHNFKR 
KGGIGGYSAGERIVDIIATDI9TKEL8K9ITKI9NFRVYYRDSRNPLWKGPAKLLWKG 
EGAW19DNSDIKVVPRRKAK11RDYGK9HAGDDCVASR8DED"
CDS 5040..5648
/note="SOR short open reading frame put. vestigial env
gene"
/codon_start=1
/translation="C9EEK8RSLGIHENRW9VNIVH8VDRMRIRTWKSLVKHHI1YVSG
KARGyFYRHHYESPHPRlSSEVHIPLGDARLVITTYMGLHTGERDWHLG9GVSIEHRK
KRYSTQVDPELAD9LIHLYYFDCFSDSAIRKALLGHIVSPRCEY8AGHNKVGSL9YLA
LAALITPKKIKPPLPSVTKLTEDRyNKP9KTKGHRGSHTHNGH"
CDS 6323..8821
/note="env-lor precursor polypeptide"
/codon_start=1
/translation="HLHICSATEKLHVTVYYGVPVHKEATTTLFCASDAKAYDTEVHN

VHATHACVPTDPMPaEVVLVHVTENFHHHKNDHVE8NHEDIISLyDaSLKPCVKLTPL 

CVSLKCTDLKNDTMTMSSSCRMIHFKCFIKMrSFNTSTSIRnKVflKFYAFFYKI HTTP 



I DNDTTSYTLTSCNTSVI T9ACPKVSFEP I P I HYCAPAGFA I LKCNNKTFNGTGPCTN 
VSTV9CTHGIRPVVST9LLLNGSLAEEEVV IRSANFTDNAKT I I V9LN6SVEI NCTRP 
NNNTRKS I R I QRGPGRAF VT I GK I GNHRQAHCN I SRAKUNNTLK6 1 DSKLREfiFGNNK 
T I I FKGSSGGDPE I VTHSFNCGGEFFYCNST9LFNSTHFNSTHSTKGSNNTEGSDT I T 
LPCRIK6IINHHQEVGKANYAPPISG81RCSSNITGLLLTRDGGNSNNESEIFRPGGG 
DNRDNBRSELYKYKVVKIEPLGVAPTKAKRRVVGREKRAVGIGALFLGFLGAAGSTHG 
AASflTLTVQARQLLSGIVQQSNNLLRAIEAQQHLLGLTVHGIKQLQARILAVERYLKD 
QQLLGiyGCSGKLICTTAVPHMASHSNKSLESIHNNMTHMEMDREINNYTSLIHSLIE 
ESQNQ9EKNEQELLELDKUASLWNUFNITNWLHYIKLFIP1IVGGLVGLRIVFAVLSVV 
NRVRQGYSPLSFflTHLPIPRGPDRPEGIEEEGGERDRDRSIRLVNGSLALIWDDLRSL 
CLFSYHRLRDLLLIVTRIVELLGRRGWEALKYHWNLLSYWSQELKNSAVSLLNATAIA 
VAEGTDRVIEVVflGAYRAIRHIPRRIRfiGLERILL" 
CDS 6323..8821
/note="envelope glycoprotein"
/codon_start=1
/translation="flLMICSATEKLHVTVYYGVPVHKEATTTLFCASDAKAYDTEVHN
VHATHACVPTDPNP9EVVLVNVTENFNMHKNDMVE9MHED1 I SLHD9SLKPCVKLTPL 
CVSLKCTDLKNDTNTNSSSGRHIHEKGEIKNCSFNISTSIRGKV6KEYAFFYKLDIIP 
IDNDTTSYTLTSCNTSVIT9ACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCTN 
VSTVSCTHGIRPVVSTQLLLNGSLAEEEVVIRSANFTDNAKTIIVQLNfiSVEINCTRP 
NNNTRKS IR 1 9RGPGRAF VT IGK IGNHR9AHCN I SRAKWNNTLK9 1 DSKLRE6FGNNK 
T 1 1 FKSSSGGDPE I VTHSFNCGGEFF YCN5TQLFNSTUFNST WSTKGSNNTEGSDT I T 
LPCRIKQI INRHQEVGKAMYAPPISGQIRCSSNITGLLLTRDGGNSNNESEIFRPGGG 
DMRDNHRSELYKYKVVKIEPIGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTHG 
AASNTLT V9AR9LLSGI V999NNLLRA I E A9SHLL9LT VHG I K9L9AR I L AVERYLKD 
QQLLGIWGCSGKLICTTAVPUNASUSNKSLE9IHNNMTHMEWDREINNYTSLIHSLIE 
ES9N96EKNEQELLELDKWASLHNWFN I TNHLUYIKLF I HI VGGLVGLR I VFAVLSV V 
NRVRQGYSPISFOTHLPIPRGPDRPEGIEEEGGERDRDRSIRLVNGSLALIWDDLRSL 
CLFSYHRLRDLLLIVTRIVELLGRRGWEALKYMWNLL0YWS8ELKNSAVSLLNATAIA 
VAEGTDRVIEVV8GAYRAIRHIPRRIRQGLERILL"
CDS 7787..8821
/note="put. lor transmembrane protein"
/codon_start=1
/translation="AVGlGALFLGFLGAAGSTHGAASHTLTV8AR8LLSGIV69QNNL
LRAIEAG9HLL9LTVHGIK9L9ARILAVERYLKD99LLGIUGCSGKLICTTAVPWNAS 
WSNKSLE8I WNNHTW JODREI NNYTSLI HSL I EES6NG8EKNE8ELLELDKHASLWN 
WFNITNHLyYIKLFIMIVGGLVGLRIVFAVLSVVNRVRSGYSPLSFSTHLPIPRGPDR 
PEGIEEEGGERDRDRSIRLVNGSLALIHDDLRSLCLFSYHRLRDLLLIVTRIVELLGR 
RGHEALKYWHNLL9YWS9ELKNSAVSLLNATAIAVAEGTDRVIEVV9GAYRAIRHIPR 
RIR8GLERILL" 

BASE COUNT 3431 a 1781 c 2368 g 2168 t 
ORIGIN 
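The BASE COUNT line of each entry can be cross-checked against the length given on its LOCUS line, since the four counts should add up to the sequence length. The sketch below is a simple illustration; the function name `base_count` is hypothetical and the sequence fragment is just the first ten bases of the query.

```python
from collections import Counter


def base_count(seq: str) -> dict:
    """Return the number of a, c, g and t residues in a nucleotide sequence."""
    counts = Counter(seq.lower())
    return {base: counts.get(base, 0) for base in "acgt"}


if __name__ == "__main__":
    # Counts as printed on the BASE COUNT line of the REHTLV3 entry.
    reported = {"a": 3431, "c": 1781, "g": 2368, "t": 2168}
    locus_length = 9748  # from the LOCUS line of the same entry
    print(sum(reported.values()) == locus_length)

    # Tally for a short fragment (start of the query sequence).
    print(base_count("GGGGGACTGG"))
```

The same check holds for the other two entries shown in this report (9770 bp for HIVPV22 and 9718 bp for HIVHXB2CG).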



Initial Score = 664 Optimized Score = 671 Significance = 52.01
Residue Identity = 97% Matches = 673 Mismatches = 13
Gaps = 3 Conservative Substitutions = 0

10 20 30 40 50 60 70 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
       TGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

X 10 20 30 40 50 60 



80 90 100 110 120 130 140 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 

|||||||||||||||| ||||||||||||||||||||||||| |||||||||||||||||||||||||||||

GGCTACTTCCCTGATTAGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGC 
70 80 90 100 110 120 130 



150 160 170 180 190 200 210 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 

|||||||||||||||||||||||||| ||| ||||||| ||||| |||||||||||||||||||||||||||
TACAAGCTAGTACCAGTTGAGCCAGAGAAGTTAGAAGAAGCCAACAAAGGAGAGAACACCAGCTTGTTACAC 

140 150 160 170 180 190 200 



220 230 240 250 260 270 280 



CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 

||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||

CCTGTGAGCCTGCATGGAATGGATGACCCGGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 
210 220 230 240 250 260 270 280 

290 300 310 320 330 340 350 360 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 

||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

TTTCATCACATGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 
290 300 310 320 330 340 350 

370 380 390 400 410 420 430 

CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC 

|||||||||||| |||||||||||||||||||||||||||| ||||||||||||||||||||||||| ||||
CTTTCCGCTGGGGACTTTCCAGGGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATCCTGC 
360 370 380 390 400 410 420 

440 450 460 470 480 490 500 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG 

||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG 

430 440 450 460 470 480 490 

510 520 530 540 550 560 570 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 
500 510 520 530 540 550 560 

580 590 600 610 620 630 640 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 
570 580 590 600 610 620 630 640 

650 660 670 680 690 X 

CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 
|||||||||||| ||||||||||||||||||||   ||||||||||||

CCGAACAGGGACCTGAAAGCGAAAGGGAAACCA---GAGCTCTCTCGACGCAGGACTCGGCTTGCTGAAGCG 
650 660 670 680 X 690 700 710 

CGCACGGCAAGAGGCGAGGGGCGGCG 
720 730 
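
The Matches, Mismatches and Gaps figures reported for an alignment like the one above can be re-derived by walking the two aligned rows column by column. The sketch below is a minimal illustration, not the search program's own scoring code: the function names are hypothetical, and it assumes the gap in the final subject row is three `-` columns wide, as the reported Gaps = 3 suggests.

```python
def alignment_stats(query: str, subject: str, gap: str = "-") -> dict:
    """Count matching, mismatching and gapped columns of a pairwise alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    stats = {"matches": 0, "mismatches": 0, "gaps": 0}
    for q, s in zip(query.upper(), subject.upper()):
        if q == gap or s == gap:
            stats["gaps"] += 1          # a column with a gap on either side
        elif q == s:
            stats["matches"] += 1
        else:
            stats["mismatches"] += 1
    return stats


def match_line(query: str, subject: str, gap: str = "-") -> str:
    """Build the '|' marker row printed between two aligned sequences."""
    return "".join(
        "|" if q == s and q != gap else " " for q, s in zip(query, subject)
    )


if __name__ == "__main__":
    # Final row of the alignment (query 649-696), gap written as '---'.
    query = "CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA"
    subject = "CCGAACAGGGACCTGAAAGCGAAAGGGAAACCA---GAGCTCTCTCGA"
    print(query)
    print(match_line(query, subject))
    print(subject)
    print(alignment_stats(query, subject))
```

Summing the per-row counts over all ten rows is one way to cross-check the totals printed in the score header.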



4. RAILEY-000-716.SEQ (1-696) 

HIVH3CG Human T-cell lymphotropic virus type III, complete

ID   HIVH3CG    standard; RNA; VRL; 9749 BP.
XX
AC   K02010; K02008; K02009;
XX
DT   18-NOV-1986 (Rel. 10, Created)
DT   23-OCT-1992 (Rel. 33, Last updated, Version 4)
XX
DE   Human T-cell lymphotropic virus type III, complete reference genome
DE   (isolates HXB2, HXB3, BH10, BH5 and BH8 of HTLV-III DNA).
XX
KW   acquired immune deficiency syndrome; complete genome; env gene;
KW   gag gene; long terminal repeat; pol gene; polyprotein; provirus;
KW   reverse transcriptase; tar protein; trans-activator.
XX
OS   Human immunodeficiency virus type 1
OC   Viridae; ss-RNA enveloped viruses; Positive strand RNA viruses;
OC   Retroviridae; Lentivirinae.
XX



RN   [1]
RP   1-653, 9116-9749
RA   Starcich B., Ratner L., Josephs S.F., Okamoto T., Gallo R.C.,
RA   Wong-staal F.;
RT   "Characterization of long terminal repeat sequences of HTLV-III";
RL   Science 227:538-540(1985).
XX
RN   [2]
RP   1-9749
RA   Wong-staal F., Gallo R.C., Chang N.T., Ghrayeb J., Papas T.S.,
RA   Lautenberger J.A., Pearson M.L., Petteway S.R.Jr., Ivanoff L.,
RA   Baumeister K., Whitehorn E.A., Rafalski J.A., Doran E.R.,
RA   Josephs S.J., Starcich B., Livak K.J., Patarca R., Haseltine W.,
RA   Ratner L.;
RT   "Complete nucleotide sequence of the AIDS virus, HTLV-III";
RL   Nature 313:277-284(1985).
XX
RN   [3]
RC   exons only, tat mrna
RP   508-9666
RA   Arya S.K., Guo C., Josephs S.F., Wong-staal F.;
RT   "Trans-activator gene of human T-lymphotropic virus type III
RT   (HTLV-III)";
RL   Science 229:69-73(1985).
XX
RN   [4]
RP   5775-6082, 8397-8499
RA   Sodroski J.G., Patarca R., Rosen C.A., Wong-staal F., Haseltine W.;
RT   "Location of the trans-activating region on the genome of human
RT   T-cell lymphotropic virus type III";
RL   Science 229:74-77(1985).
XX

RN   [5]
RC   mrna splice sites
RA   Rabson A.B., Daugherty D.F., Venkatesan S., Boulukos K.E.,
RA   Benn S.I., Folks T.M., Feorino P., Martin M.;
RT   "Transcription of novel open reading frames of AIDS retrovirus
RT   during infection of lymphocytes";
RL   Science 229:1388-1390(1985).
XX
RN   [6]
RC   27k antigen cds
RA   Allan J.S., Coligan J.E., Lee T.H., McLane M.F., Kanki P.J.,
RA   Groopman J.E., Essex M.;
RT   "A new HTLV-III/LAV encoded antigen detected by antibodies from
RT   AIDS patients";
RL   Science 230:810-813(1985).
XX
RN   [7]
RC   in hxb-3
RP   5773-8933
RA   Crowl R., Ganguly K., Gordon M., Conroy R., Schaber M., Kramer R.,
RA   Shaw G., Wong-staal F., Reddy E.P.;
RT   "HTLV-III env gene products synthesized in E. coli are recognized
RT   by antibodies present in the sera of AIDS patients";
RL   Cell 41:979-986(1985).
XX
RN   [8]
RC   gp160 and gp120 coding sequences
RA   Allan J.S., Coligan J.E., Barin F., McLane M.F., Sodroski J.G.,
RA   Rosen C.A., Haseltine W.A., Lee T.H., Essex M.;
RT   "Major glycoprotein antigens that induce antibodies in AIDS
RT   patients are encoded by HTLV-III";
RL   Science 228:1091-1094(1985).
XX

RN   [9]
RC   regulatory sequences in the ltr
RA   Rosen C.A., Sodroski J.G., Haseltine W.A.;
RT   "The location of cis-acting regulatory sequences in the human T
RT   cell lymphotropic virus type III (HTLV-III/LAV) long terminal
RT   repeat";
RL   Cell 41:813-823(1985).
XX
RN   [10]
RP   1-9749
RA   Van Beveren C., Coffin J.M., Hughes S.;
RT   "Appendix B: HTLV-3/LAV genome";
RL   (in) Weiss R., Teich N., Varmus H. and Coffin J.M. (eds.);
RL   RNA TUMOR VIRUSES, SECOND EDITION: 1102-1148;
RL   Cold Spring Harbor Laboratory, New York (1985)
XX
RN   [11]
RC   trans-activator function and tar sequence
RA   Rosen C.A., Sodroski J.G., Goh W.C., Dayton A.I., Lippke J.,
RA   Haseltine W.A.;
RT   "Post-transcriptional regulation accounts for the trans-activation
RT   of the human T-lymphotropic virus type III";
RL   Nature 319:555-559(1986).
XX
RN   [12]
RC   pol coding sequence
RA   Marzo Veronese F., Copeland T.D., DeVico A.L., Rahman R.,
RA   Oroszlan S., Gallo R.C., Sarngadharan M.G.;
RT   "Characterization of highly immunogenic p66/p51 as the reverse
RT   transcriptase of HTLV-III/LAV";
RL   Science 231:1289-1291(1986).
XX

RN   [13]
RC   the 23k sor gene product
RA   Kan N.C., Franchini G., Wong-staal F., DuBois G.C., Robey W.G.,
RA   Lautenberger J.A., Papas T.S.;
RT   "Identification of HTLV-III/LAV sor gene product and detection of
RT   antibodies in human sera";
RL   Science 231:1553-1555(1986).
XX
RN   [14]
RC   pol nh2-terminal region
RA   Kramer R.A., Schaber M.D., Skalka A.M., Ganguly K., Wong-staal F.,
RA   Reddy E.P.;
RT   "HTLV-III gag protein is processed in yeast cells by the virus
RT   pol-protease";
RL   Science 231:1580-1584(1986).
XX
RN   [15]
RC   sor 23k protein
RA   Lee T.H., Coligan J.E., Allan J.S., McLane M.F., Groopman J.E.,
RA   Essex M.;
RT   "A new HTLV-III/LAV protein encoded by a gene found in cytopathic
RT   retroviruses";
RL   Science 231:1546-1549(1986).
XX
RN   [16]
RC   sor 23k protein
RA   Sodroski J.G., Goh W.C., Rosen C.A., Tartar A., Portetelle D.,
RA   Burny A., Haseltine W.;
RT   "Replicative and cytopathic potential of HTLV-III/LAV with sor
RT   gene deletions";
RL   Science 231:1549-1553(1986).
XX

RN   [17]
RC   sp1 binding sites in the promoter region
RA   Jones K.A., Kadonaga J.T., Luciw P.A., Tjian R.;
RT   "Activation of the AIDS retrovirus promoter by the cellular
RT   transcription factor, Sp1";
RL   Science 232:755-759(1986).
XX
RN   [18]
RC   acceptor and donor splice sites for tat and 27k
RA   Arya S.K., Gallo R.C.;
RT   "Three novel genes of human T-lymphotropic virus type III: immune
RT   reactivity of their products with sera from acquired immune
RT   deficiency syndrome patients";
RL   Proc. Natl. Acad. Sci. U.S.A. 83:2209-2213(1986).
XX
RN   [19]
RC   deletion mutants in the tat gene
RA   Dayton A.I., Sodroski J.G., Rosen C.A., Goh W.C., Haseltine W.A.;
RT   "The trans-activator gene of the human T cell lymphotropic virus
RT   type III is required for replication";
RL   Cell 44:941-947(1986).
XX
RN   [20]
RC   hypervariable and conserved regions in the env gene
RA   Willey R.H., Ruthledge R.A., Dias S., Folks T., Theodore T.S.,
RA   Buckler C.E., Martin M.A.;
RT   "Identification of conserved and divergent domains within the
RT   envelope gene of the acquired immunodeficiency syndrome
RT   retrovirus";
RL   Proc. Natl. Acad. Sci. U.S.A. 83:5038-5042(1986).
XX
RN   [21]
RC   art cds boundaries
RA   Sodroski J.G., Goh W.C., Rosen C.A., Dayton A., Terwilliger E.,
RA   Haseltine W.;
RT   "A second post-transcriptional trans-activator gene required for
RT   HTLV-III replication";
RL   Nature 321:412-417(1986).
XX
DR   EPD; 14085; HIV-1 (HTLV-III) LTR.
DR   SWISS-PROT; P03347; GAG. HIV10.
DR   SWISS-PROT; P03366; POL. HIV10.
DR   SWISS-PROT; P03375; ENV HIV10.
DR   SWISS-PROT; P03401; VIF. HIV10.
DR   SWISS-PROT; P03404; NEF. HIV10.
DR   SWISS-PROT; P04606; TAT. HIV10.
DR   SWISS-PROT; P04616; REV. HIV10.
DR   SWISS-PROT; P04617; REV. HIV1P.
DR   SWISS-PROT; P04624; ENV. HIV1Y.
DR   SWISS-PROT; P05854; NEF. HIV1Y.
DR   SWISS-PROT; P05920; VPU. HIV10.
DR   SWISS-PROT; P05926; VPR HIV10.
XX
CC   Sequence for [7] was kindly supplied in computer readable form by
CC   R. Crowl, 09/17/85. R. Patarca provided sites information and a
CC   clean copy for [4], 09/16/85. Acquired immune deficiency syndrome
CC   (AIDS) is caused by a retrovirus known by several names, perhaps
CC   representing two separate strains: human T-cell lymphotropic
CC   virus-III (HTLV-III), whose sequence is given below, and
CC   lymphadenopathy-associated virus (LAV) are thought to be one strain
CC   differing from AIDS-associated retrovirus type 2 (ARV-2) when
CC   overall homology is the criterion. Some reading frame similarities
CC   suggest that ARV-2 and LAV are more closely related. All three
CC   viruses, whose sequences do not differ by more than 6%, are
CC   believed to belong to the C type subfamily Lentiviridae, the "slow"
CC   retroviruses. The BH10 sequence differs from BH8 and BH5 by 0.9% in
CC   the coding regions and 1.8% in the noncoding regions, and the
CC   authors of [2] believe that these are stable variants. The 5' and
CC   3' LTRs of BH10 and BH8 were not fully sequenced; the positions



CC   (493-675 and 9608-9749) were filled in by [2] from the proviral
CC   clone HXB2 [1]. The sequence below is that of BH10 with exception
CC   of the variation at position 9197 which allows annotation of the
CC   27K coding sequence. The BH8 sequence spans bases 6033 to 9607, the
CC   BH5 sequence spans bases 675 to 6038, and the HXB3 sequence [7]
CC   spans bases 5778 to 8933. While this entry is offered as the
CC   reference locus for the AIDS retroviral sequence loci, no claim is
CC   being made that this sequence is more prevalent or typical than
CC   others, all of which have been entered in this library with
CC   annotation. The HTLV-III genome encodes at least six proteins or
CC   polyproteins: gag, pol, env, TAT, 27K antigen and the sor 23K
CC   product. The 3' ORF (positions 8797-9447) is truncated in BH10
CC   (stop codon at positions 9196-9198), but reads through in BH8 and
CC   other sequences to yield what is now called the 27K antigen. The
CC   sequence below is from BH10 with exception of the variation at
CC   position 9197 which allows annotation of the 27K coding sequence.
CC   Additionally there are four short open reading frames, bases
CC   1248-1406, 4442-4642, 5592-5828 and 6095-6340, which are conserved
CC   to a large degree. A seventh gene has been proposed based upon a
CC   combination of mutational and regulatory evidence: called "ART" (
CC   for anti-repression transactivator), its product appears to act
CC   post-transcriptionally to relieve negative repression of gag and
CC   env production [21]. The exon assignments for ART are putative, but

CC   if they are corroborated, the ART protein would be 116 amino acids
CC   in length. The mechanism for pol gene translation has not been
CC   elucidated; a gag-pol fusion protein is possible; splicing or
CC   frameshift have not been ruled out. The viral protease would be
CC   determined by the region in question. Approximately two-thirds of
CC   the variant sites in the gag and pol genes are "silent mutations",
CC   while over half of those in the env gene are not. Reference [20]
CC   defines divergent and conserved regions for the env gene. Because
CC   of the excessive variability of the env gene, differences between
CC   the sequences summarized herein and other env gene entries have not
CC   been annotated; only HTLV-III sequence variations have been
CC   included in the sites of this entry. Other entries will include
CC   information for alignment with this entry, including the Zaire and
CC   New York isolate sequences reported by [20]. The TAT protein
CC   (trans-activator protein, approximately 14 kd) is an effector of an
CC   autostimulatory pathway through interaction with a positive control
CC   element, the trans-activating responsive sequence, TAR. TAT seems
CC   to be a transcriptional control molecule in HTLV-I, but [11]
CC   demonstrates that it is a post-transcriptional regulatory molecule
CC   in HTLV-III. Deletion mutants in the TAT gene are incapable of
CC   prolific replication and exhibit no cytopathic effects in T4+ cell
CC   lines [19]. The TAR sequence(s) are found to be between -17 and +80
CC   relative to the cap site +1 (base 455) and is highly conserved.
CC   Enhancer sequences which need not be viral-specific are found

CC   upstream from TAR [9],[11]. Three tandem decanucleotide Sp1 binding
CC   sites are located between bases 377 and 409, of which site III
CC   shows the strongest affinity for the cellular factor; intact, the
CC   three sites cause up to a tenfold effect on transcriptional
CC   efficiency in vitro [17] (The authors demonstrate the existence of
CC   Sp1 in a human T-cell line). In addition to the ~9.4 kb genomic
CC   mRNA, subgenomic mRNAs of 7.4, 5.5, 5.0, 4.3, 2.0 and 1.8 have been
CC   detected. All are probably polyadenylated at the same site,
CC   position 9666 below, with a potential polyadenylation signal at
CC   9642-9648, and capped at the same site, position 455, with a
CC   potential TATA box at 427-431. The doubly-spliced transcript of
CC   about 2.0 kb is responsible for the TAT message at least, and
CC   depending upon the acceptor site, also for the sor and 27K
CC   messages, given that a single, albeit partial, mRNA exists for all
CC   three [18]. The acceptor splice for TAT is at position 5811 and the
CC   putative acceptor splice for 27K is at position 6010; the donor
CC   splice site in all three cases would be at position 6079 [18]. The
CC   doubly spliced message would also encode the newly proposed ART
CC   protein.
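
The positions quoted in the comment block are 1-based and inclusive, so a little arithmetic is needed to turn them into lengths, codon numbers or slice bounds. The sketch below is a minimal illustration with hypothetical helper names; it assumes the 3' ORF is read in frame from its first listed base (8797) and uses only the coordinates quoted above.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive range such as 8797-9447."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def to_slice(start: int, end: int) -> slice:
    """Convert a 1-based inclusive range to a 0-based Python slice."""
    return slice(start - 1, end)


def codon_position(orf_start: int, position: int) -> tuple:
    """Return (0-based codon index, offset 0..2 within that codon)."""
    return divmod(position - orf_start, 3)


if __name__ == "__main__":
    orf_start, orf_end = 8797, 9447            # 3' ORF from the comment block
    length = span_length(orf_start, orf_end)
    print(length, length % 3)                  # whole number of codons if 0
    print(codon_position(orf_start, 9196))     # first base of the BH10 stop codon
    print(codon_position(orf_start, 9197))     # the annotated variant position
    print(to_slice(orf_start, orf_end))
```

Under these assumptions the ORF spans 651 bases (217 codons), the stop codon at 9196-9198 falls on a codon boundary, and position 9197 is the middle base of that codon.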



XX
Key             Location/Qualifiers
repeat_region   1..634
                /note="5' LTR"
repeat_region   1..634
                /note="5' LTR"
variation       82..82
                /note="a in BH10; g in H9"
variation       101..101
                /note="g in BH10; a in H9"
variation       108..108
                /note="a in [2], H9; g in HXB2 [1]"
variation       164..164
                /note="g in [2]; t in HXB2 [1], H9"
variation       168..168
                /note="t in [2]; g in HXB2 [1], H9"
variation       176..176
                /note="a in [2]; g in HXB2 [1], H9"
variation       183..183
                /note="c in [2], H9; t in HXB2 [1]"
variation       227..227
                /note="a in [2], H9; g in HXB2 [1]"
variation       291..291
                /note="a in [2]; g in HXB2 [1], H9"
variation       333..333
                /note="c in [2]; t in HXB2 [1], H9"
misc_feature    377..386
                /note="Sp1 binding site III [17]"
misc_feature    388..397
                /note="Sp1 binding site II [17]"
misc_feature    399..408
                /note="Sp1 binding site I [17]"
variation       421..421
                /note="c in BH10, BH5; t in H9"
repeat_region   454..551
                /note="R repeat 5' copy"
repeat_region   454..551
                /note="R repeat 5' copy"
misc_RNA        455..455
                /note="genomic mRNA start (cap site) [10]"
misc_RNA        455..455
                /note="TAT,ART mRNA exon 1 start (cap site) [10],
                [18],[21]"
variation       501..501
                /note="a in BH10, BH5, H9; g in HXB2 [1]"
misc_feature    636..653
                /note="primer (Lys-tRNA) binding site"
variation       654..654
                /note="c in BH10, BH5; t in H9"
variation       677..677
                /note="g in BH10, BH5; ggag in H9"
variation       704..704
                /note="tga in BH10, H9; g in BH5 [2]"
CDS             787..2325
                /note="gag polyprotein precursor"
variation       1290..1290
                /note="a in BH10; g in BH5 [2], H9"
variation       1431..1431
                /note="a in BH10; g in BH5 [2], H9"
variation       1455..1455
                /note="t in BH10, H9; c in BH5 [2]"
variation       1611..1611
                /note="a in BH10, H9; g in BH5 [2]"
variation       1620..1620
                /note="c in BH10, H9; t in BH5 [2]"



variation       1656..1656
                /note="a in BH10, H9; g in BH5 [2]"
variation       1662..1662
                /note="t in BH10; c in BH5 [2], H9"
variation       1675..1675
                /note="g in BH10, BH5; c in H9"
variation       1722..1722
                /note="g in BH10, H9; a in BH5 [2]"
variation       1806..1806
                /note="g in BH10, BH5; a in H9"
variation       1845..1845
                /note="a in BH10, BH5; g in H9"
variation       1903..1903
                /note="a in BH10, H9; ? in BH5 [2]"
variation       1906..1906
                /note="g in BH10, H9; ? in BH5 [2]"
variation       1923..1923
                /note="g in BH10, H9; ? in BH5 [2]"
variation       1950..1950
                /note="g in BH10, H9; ? in BH5 [2]"
variation       1953..1953
                /note="g in BH10, H9; ? in BH5 [2]"
variation       1988..1988
                /note="c in BH10, H9; ? in BH5 [2]"
variation       1992..1992
                /note="c in BH10, H9; ? in BH5 [2]"
variation       2003..2003
                /note="g in BH10, H9; ? in BH5 [2]"
variation       2013..2013
                /note="g in BH10, H9; a in BH5 [2]"
CDS             2391..5129
                /note="pol polyprotein (NH2-terminus uncertain; AA
                at 2391)"
variation       2468..2468
                /note="g in BH10, BH5; a in H9"
variation       2591..2591
                /note="c in BH10, H9; t in BH5 [2]"
variation       2600..2600
                /note="g in BH10, H9; a in BH5 [2]"
variation       2741..2741
                /note="g in BH10; a in BH5 [2], H9"
variation       2827..2827
                /note="a in BH10, H9; g in BH5 [2]"
variation       2853..2858
                /note="a in BH10, H9; g in BH5 [2]"
variation       2990..2990
                /note="c in BH10, H9; t in BH5 [2]"
variation       3007..3007
                /note="tta in BH10, H9; gtg in BH5 [2]"
variation       3097..3097
                /note="a in BH10; g in BH5 [2], H9"
variation       3122..3122
                /note="c in BH10, H9; t in BH5 [2]"
variation       3222..3222
                /note="c in BH10, H9; t in BH5 [2]"
variation       3302..3302
                /note="ag in BH10, H9; ga in BH5 [2]"
variation       3368..3368
                /note="g in BH10, H9; a in BH5 [2]"
variation       3389..3389
                /note="g in BH10, BH5; a in H9"
variation       3395..3395
                /note="c in BH10, H9; t in BH5 [2]"
variation       3755..3755
                /note="a in BH10, BH5; g in H9"
variation       3767..3767



                /note="g in BH10, H9; a in BH5 [2]"
variation       3833..3833
                /note="t in BH10, BH5; c in H9"
variation       3855..3855
                /note="t in BH10, BH5; c in H9"
variation       3899..3899
                /note="c in BH10, BH5; t in H9"
variation       3922..3922
                /note="a in BH10, H9; g in BH5 [2]"
variation       3934..3934
                /note="a in BH10, BH5; g in H9"
variation       3954..3954
                /note="g in BH10, BH5; c in H9"
variation       3962..3962
                /note="caa in BH10, H9; tag in BH5 [2]"
variation       3977..3977
                /note="g in BH10, H9; a in BH5 [2]"
variation       3984..3984
                /note="c in BH10, H9; a in BH5 [2]"
variation       3993..3993
                /note="a in BH10, H9; c in BH5 [2]"
variation       4010..4010
                /note="a in BH10; g in BH5 [2], H9"
variation       4016..4016
                /note="g in BH10, H9; a in BH5 [2]"
variation       4029..4029
                /note="t in BH10, H9; c in BH5 [2]"
variation       4049..4049
                /note="a in BH10; g in BH5 [2], H9"
variation       4064..4064
                /note="c in BH10, H9; t in BH5 [2]"
variation       4116..4116
                /note="a in BH10, BH5; c in H9"
variation       4167..4167
                /note="g in BH10, BH5; c in H9"
variation       4292..4292
                /note="t in BH10, H9; a in BH5 [2]"
CDS             5074..5652
                /note="sor 23K protein"
variation       5156..5156
                /note="a in BH10, H9; g in BH5 [2]"
variation       5314..5314
                /note="t in BH10, BH5; c in H9"
variation       5348..5348
                /note="a in BH10, H9; g in BH5 [2]"
variation       5401..5401
                /note="t in BH10, H9; c in BH5 [2]"
variation       5412..5412
                /note="c in BH10, H9; t in BH5 [2]"
variation       5548..5548
                /note="a in BH10, H9; g in BH5 [2]"
variation       5628..5628
                /note="g in BH10, H9; a in BH5 [2]"
variation       5846..5846
                /note="g in BH10, H9, HXB3; a in BH5 [2]"
CDS             5864..6078
                /note="TAT protein, exon 2 (first expressed exon)"
variation       5934..5934
                /note="a in BH10, H9, HXB3; c in BH5 [2]"
CDS             6003..6078
                /note="ART protein, exon 2 (first expressed exon;
                putative)"
variation       6035..6045
                /note="cctcctcaagg in BH10,HXB3 [7]; gctcatcgaag
                in BH8 [2]; g in BH5 [2], clone 12 cDNA [21]"
variation       6086..6086



                /note="g in BH10, BH8, H9; a in HXB3 [7]"
variation       6096..6096
                /note="t in BH10, HXB3 [7], H9; c in BH8 [2]"
variation       6108..6108
                /note="a in BH10, HXB3 [7], H9; c in BH8 [2]"
variation       6113..6114
                /note="gc in BH10,HXB3 [7], H9; gtaac in BH8 [2]"
variation       6124..6124
                /note="a in BH10, HXB3 [7], H9; c in BH8 [2]"
variation       6152..6152
                /note="g in BH10, HXB3 [7], BH8; c in H9"
CDS             6255..8825
                /note="envelope protein precursor (env)"
variation       6373..6373
                /note="a in BH10, HXB3 [7], H9; t in BH8 [2]"
variation       6474..6474
                /note="t in BH10, BH8 [2], H9; g in HXB3 [7]"
variation       6748..6748
                /note="t in BH10, HXB3 [7], H9; a in BH8 [2]"
variation       6929..6929
                /note="t in BH10, HXB3 [7], H9; c in BH8 [2]"
variation       7088..7088
                /note="a in BH10, H9; g in BH8 [2], HXB3 [7]"
variation       7119..7119
                /note="a in BH10, HXB3 [7], H9; g in BH8 [2]"
variation       7121..7123
                /note="cca in BH10,H9; cac in BH8 [2],HXB3 [7]"
variation       7171..7172
                /note="gt in BH10, H9; aa in BH8 [2], HXB3[7]"
variation       7187..7187
                /note="a in BH10, H9; g in BH8 [2], HXB3 [7]"
variation       7272..7273
                /note="aa in BH10, H9; gc in BH8[2], HXB3 [7]"
variation       7291..7291
                /note="a in BH10, BH8 [2], H9; c in HXB3 [7]"
variation       7343..7343
                /note="g in BH10, BH8 [2]; a in HXB3 [7], H9"
variation       7439..7454
                /note="gtttaatagtacttgg in BH10, HXB3 [7], and H9"
variation       7461..7461
                /note="a in BH10, BH8 [2]; g in HXB3 [7], H9"
variation       7499..7499
                /note="c in BH10, BH8 [2]; a in HXB3 [7], H9"
variation       7521..7521
                /note="a in BH10, BH8 [2]; t in HXB3 [7], H9"
variation       7574..7574
                /note="t in BH10, BH8 [2]; c in HXB3 [7], H9"
variation       7636..7637
                /note="cg in BH10, HXB3 [7], H9; gc in BH8[2]"
variation       7636..7636
                /note="g in BH10, BH8 [2]; a in HXB3 [7], H9"
variation       7645..7645
                /note="a in BH10, BH8 [2], H9; g in HXB3 [7]"
variation       8060..8061
                /note="ca in BH10, BH8 [2], H9; ac in H"
variation       8127..8127
                /note="a in BH10, BH8 [2], H9; c in HXB3 [7]"
variation       8131..8131
                /note="t in BH10, BH8 [2], H9; c in HXB3 [7]"
variation       8135..8135
                /note="c in BH10, BH8 [2], H9; g in HXB3 [7]"
variation       8257..8257
                /note="g in BH10, BH8, HXB3; a in H9"
variation       8273..8273
                /note="t in BH10, BH8, HXB3; g in H9"
variation       8364..8364



                /note="g in BH10, HXB3 [7]; a in BH8 [2], H9"
CDS             8409..8683
                /note="ART protein, exon 3 (putative; AA at 8411)"
CDS             8409..8454
                /note="TAT protein, exon 3 (AA at 8410)"
variation       8422..8422
                /note="t in BH10,HXB3 [7], clone 12 cDNA [21]; a in
                BH8 [2]; c in H9"
variation       8464..8464
                /note="g in BH10,BH8,HXB3,clone 12 cDNA [21]; a in
                H9"
variation       8657..8657
                /note="g in BH10,BH8 [2]; a in HXB3 [7],H9, clone
                12 cDNA [21]"
variation       8672..8672
                /note="g in BH10,HXB3 [7], clone 12 cDNA [21],H9; a
                in BH8 [2]"
variation       8692..8692
                /note="g in BH10,HXB3 [7], clone 12 cDNA [21],H9; a
                in BH8 [2]"
variation       8748..8748
                /note="g in BH10,HXB3 [7], clone 12 cDNA [21],H9; t
                in BH8 [2]"
variation       8758..8758
                /note="g in BH10,H9; c in BH8 [2]; a in HXB3 [7],
                clone 12 cDNA [21]"
variation       8771..8771
                /note="t in BH10,HXB3 [7], clone 12 cDNA [21],H9; c
                in BH8 [2]"
CDS             8797..9447
                /note="27K protein, exon 3 (first expressed exon)"
variation       8857..8857
                /note="g in BH10,BH8,HXB3,clone 12 cDNA [21]; a in
                H9"
variation       8924..8924
                /note="c in BH10,HXB3 [7], clone 12 cDNA [21],H9; t
                in BH8 [2]"
variation       8967..8967
                /note="c in BH10, clone 12 cDNA [21],H9; t in BH8
                [2]"
variation       8978..8978
                /note="a in BH10,clone 12 cDNA [21],H9; c in BH8
                [2]"
variation       8985..8985
                /note="t in BH10, clone 12 cDNA [21],H9; c in BH8
                [2]"
variation       8987..8987
                /note="a in BH10,BH8; c in H9,clone 12 cDNA [21]"
variation       8994..8994
                /note="c in BH10, clone 12 cDNA [21],H9; t in BH8
                [2]"
variation       9019..9019
                /note="g in BH10,BH8; a in H9, clone 12 cDNA [21]"
repeat_region   9116..9749
                /note="3' LTR"
variation       9169..9196
                /note="t in BH10, clone 12 cDNA [21]; c in BH8 [2]"
variation       9197..9197
                /note="g in BH8 [2], H9, clone 12 cDNA [21]; a in
                BH10 [2]"
variation       9216..9216
                /note="g in BH10,BH8; a in H9, clone 12 cDNA [21]"
variation       9222..9223
                /note="ga in BH10, clone 12 cDNA [21],H9; ag in
                BH8[2]"
variation       9279..9279



                /note="g in BH10,BH8, clone 12 cDNA [21]; t in H9"
variation       9283..9283
                /note="t in BH10,BH8, clone 12 cDNA [21]; g in H9"
variation       9284..9284
                /note="t in BH10,H9, clone 12 cDNA [21]; a in BH8
                [2]"
variation       9291..9291
                /note="a in BH10,BH8, clone 12 cDNA [21]; g in H9"
variation       9297..9297
                /note="c in BH10, clone 12 cDNA [21],H9; t in BH8
                [2]"
variation       9354..9354
                /note="g in BH10, HIVDSH), H9; t in BH8 [2]"
variation       9406..9406
                /note="a in BH10,BH8; g in H9, clone 12 cDNA [21]"
variation       9448..9448
                /note="c in BH10; t in BH8 [2],H9,clone 12 cDNA"
variation       9536..9563
                /note="c in BH10,BH8, clone 12 cDNA [21]; g in H9"
repeat_region   9570..9666
                /note="R repeat 3' copy"
variation       9616..9616
                /note="g in HXB2; a in H9, clone 12 cDNA [21]"
variation       9621..9621
                /note="g in HXB2; a in H9, clone 12 cDNA [21]"
variation       9663..9663
                /note="t in BH10,H9; tg in clone 12 cDNA [21]"
polyA_site      9666..9666
                /note="TAT,ART,27K mRNA exon 3 end (poly-A site)
                [10], [18], [21]"
polyA_site      9666..9666
                /note="genomic mRNA end (poly-A site) [10]"
XX 

SQ   Sequence 9749 BP; 3431 A; 1781 C; 2369 G; 2168 T; 0 other;
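
The SQ line carries its own consistency check: the per-base counts should add up to the declared sequence length. The sketch below is a minimal illustration of that check; `parse_sq_line` is a hypothetical helper, and the regular expressions assume the `<count> <base>;` layout seen on the line above rather than any official parser.

```python
import re


def parse_sq_line(line: str):
    """Return (declared_length, {symbol: count}) from an EMBL-style SQ line."""
    length_match = re.search(r"(\d+)\s+BP", line)
    if length_match is None:
        raise ValueError("no '<n> BP' length field found")
    declared = int(length_match.group(1))
    # Pairs such as '3431 A' or '0 other'; 'BP' is not matched by this pattern.
    counts = {
        symbol: int(number)
        for number, symbol in re.findall(r"(\d+)\s+([ACGT]|other)\b", line)
    }
    return declared, counts


if __name__ == "__main__":
    sq = "SQ   Sequence 9749 BP; 3431 A; 1781 C; 2369 G; 2168 T; 0 other;"
    declared, counts = parse_sq_line(sq)
    print(declared, counts, sum(counts.values()) == declared)
```

A mismatch between the sum and the declared length usually points to a transcription error in one of the counts.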

Initial Score = 664 Optimized Score = 671 Significance = 52.01 
Residue Identity = 97% Matches = 673 Mismatches = 13 

Gaps = 3 Conservative Substitutions = 0 

10 20 30 40 50 60 70 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

       TGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 
X 10 20 30 40 50 60 

80 90 100 110 120 130 140 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 

|||||||||||||||| ||||||||||||||||||||||||| |||||||||||||||||||||||||||||
GGCTACTTCCCTGATTAGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGC 
70 80 90 100 110 120 130 

150 160 170 180 190 200 210 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 

|||||||||||||||||||||||||| ||| ||||||| ||||| |||||||||||||||||||||||||||

TACAAGCTAGTACCAGTTGAGCCAGAGAAGTTAGAAGAAGCCAACAAAGGAGAGAACACCAGCTTGTTACAC 
140 150 160 170 180 190 200 

220 230 240 250 260 270 280 

CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 

||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||

CCTGTGAGCCTGCATGGAATGGATGACCCGGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 
210 220 230 240 250 260 270 280 



290 300 310 320 330 340 350 360 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 
||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||



TTTCATCACATGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 
290 300 310 320 330 340 350 



370 380 390 400 410 420 430 

CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC 

|||||||||||| |||||||||||||||||||||||||||| ||||||||||||||||||||||||| ||||

CTTTCCGCTGGGGACTTTCCAGGGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATCCTGC 
360 370 380 390 400 410 420 

440 450 460 470 480 490 500 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG 

||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG 
430 440 450 460 470 480 490 

510 520 530 540 550 560 570 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 
500 510 520 530 540 550 560 

580 590 600 610 620 630 640 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 
570 580 590 600 610 620 630 640 

650 660 670 680 690 X 

CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 

|||||||||||| ||||||||||||||||||||   ||||||||||||

CCGAACAGGGACCTGAAAGCGAAAGGGAAACCA---GAGCTCTCTCGACGCAGGACTCGGCTTGCTGAAGCG 
650 660 670 680 X 690 700 710 



CGCACGGCAAGAGGCGAGGGGCGGCG 
720 730 



5. RAILEY-000-716.SEQ (1-696) 

HIVJRCSF Human immunodeficiency virus type 1, isolate JRCSF 

LOCUS       HIVJRCSF     9540 bp ss-RNA             VRL       28-SEP-1992
DEFINITION  Human immunodeficiency virus type 1, isolate JRCSF; complete
            genome.
ACCESSION   M38429
KEYWORDS    long terminal repeat (LTR).
SOURCE      HIV-1 proviral DNA from extracellular virus taken from cerebral
            spinal fluid (1986). Infectious clone.
  ORGANISM  Human immunodeficiency virus type 1
            Viridae; ss-RNA enveloped viruses; Positive strand RNA virus;
            Retroviridae; Lentivirinae.
REFERENCE   1  (bases 1 to 9540)
  AUTHORS   Koyanagi,S. and Chen,I.S.
  JOURNAL   Unpublished (1988) UCLA School of Medicine, Los Angeles,
  STANDARD  full automatic
COMMENT     Kindly provided in computer-readable form by Irvin Chen, UCLA
            School of Medicine, Los Angeles. JRCSF and JRFL (see <HIVJRFL>) were
            isolated from cerebral spinal fluid and brain tissue of the patient
            JR, who died with Kaposi's sarcoma and severe AIDS encephalopathy
            (Science 236, 819-822, 1987). Both clones are infectious, but JRFL
            productively infects macrophages while JRCSF does not. (Peripheral
            blood was not available from the patient).

            The JRCSF and JRFL env nucleotide sequences differ by at least 3%;
            further characterization of them is forthcoming (Peng,S. et al.,
            Nature 1990, in press). Both manifest insertions in nef previously
            reported for HIVBRVA.
FEATURES             Location/Qualifiers



     LTR             1..635
                     /partial
     protein_bind    378..387
                     /bound_moiety="Sp1"
     protein_bind    389..398
                     /bound_moiety="Sp1"
     protein_bind    400..409
                     /bound_moiety="Sp1"
     exon            5842..6056
                     /number=2
                     /gene="tat"
     exon            5981..6056
                     /number=2
                     /gene="rev"
     exon            8366..8456
                     /number=3
                     /gene="tat"
     exon            8366..8640
                     /number=3
                     /gene="rev"
     LTR             9103..9540
                     /partial
     CDS             <2085..5108
                     /gene="pol"
                     /product="pol polyprotein"
                     /codon_start=1
                     /translation="FFREDLAFL9GKAREFPSE9TRANSPTRRELSVHGRDSNSLSEA
GAEAGADRQGIVSFNFP8ITLWQRPLVTIKIGG6LKEALLDTGADDTVLEDMDLPGRH 
KPKtllGGIGGFIKVRQYDQIPIDICGHKAVGTVLVGPTPVNIIGRNLLTQIGCTLNFP 
ISPIETVPVKLKPGttDGPKVKSWPLTEEKIKALVEICTEMEKEGKISKIGPENPYNTP 
VFA I KKKDSTKHRKLVDFRELNRRT9DFHEV9LGIPHPAGLKKKKSVTVLDVGDA YFS 
VPLDKDFRKYTAFTIPSINNETPGIRY9YNVLPQGWKGSPAIFQSSHTKILEPFRK9N 
PDI I IY9YHDDLYVGSDLEIG9HRTKIEELR9HLLKyGFTTPDKKH9KEPPFLHHGYE 
LHPDKWTV9PIVLPEKDS«TVNDI9KLVGKLNHAS9IYAGIKVK9LCKLLRGTKALTE 
VIPLTKEAELELAENREILKEPVHGVYYDPSKDLIVEI9K9G9G9HTY9IF9EPFKNL 
KTGKYARTRGAHTNDVK9LTEAV9KIANESIVIHGKIPKFKLPI9KETWETHHTEYHS 
ATHIPEyEFVNTPPLVKL«Y9LEKEPIVGAETFYVDGAANRETKLGKAGYVTSRGR9K 
VVSLTDTTNQKTELQAIHLALQDSGLEVMIVTDS8YALGI I8AQPDKSESELVSQIIE 
QLIKKEKVYLAyVPAHKGIGGNE9VDKLVSAGIRKVLFLDGIDKA9EDHEKYHSNWRA 
MASDFNLPPIVAKEIVASCDKCQLKGEAHHGQVDCSPGIWQLDCTHLEGKI ILVAVHV 
ASG Y I E AEV I PAETGSET A YFLLKL AGRUP VTT I HTDNGSNFTSTTVKA ACWHAG I K6 
EFG I PYNP9S9G VVESMNKELKK I I G9 VRD9 AEHLKTAV8NAVF I HNFKRKGG I GGYS 
AGERI IDI I ATDIQTKELQKQITKI9NFRVYYRDNRDPIHKGPAKLLHKGEGAVVI9D 
NSDIKVVPRRKVKIIRDYGKQMAGDDCVASRQDED"
CDS 790..2304
/gene="gag"
/product="gag polyprotein"
/codon_start=1
/translation="MGARASVLSGGELDRWEKIRLRPGGKKKYRLKHIVWASRELERF
AVNPGLLESSEGCRQILGQLQPSLKTGSEELTSLYNTVATLYCVHQRIEIKDTKEALE 
KIEEEQTKSHKKA99AAADTGNSS9VS9NYPIV9NL9G9MVH9AISPRTLNAHVKVIE 
EKAFSPEVIPHFSALSEGATPflDLNTHLNTVGGHBAAHflHLKETIHEEAAEMDRLHPV 
HAGPIAPGQHREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWI ILGLNKIVRM 
YSPVSILDIRQGPKEPFRDYVDRFYKTLRAE9ATGEVKNWHTETLLV9NANPDCKTIL 
KALGPAATLEEMflTAC9GVGGPGHKARVLAEAf1S9VTNPATIHI19RGNFRN9RKNVKC 
FNCGKEGHIARNCRAPRKKGCHKCGKEGH9HKECTER9ANFLGKIHPSYKGRPGNFL9 
SRPEPTAPPEESFRFGEETATPS9K9EQK9EPIDKELYPLTSLRSLFGNDPSS9"
CDS 5053..5631
/gene="vif"
/product="vif protein"
/codon_start=1
/translation="HENRWSVMIVyGVDRf1RIRTHNSLVKHHI1YISGKAKGHIYKHHY
ESTNPRVSSEV9IPLGDARLVITTYWGLHTGERDHHLG9GVSHEHRTRRYST9VDPDL 
AD9LIHLYYFDCFSESAIRNAILGHIVSPRCEY9AGHSKVGSL8YLALTALIKPKKIK 
PPLPSVKKLTEDRHNKP9KTKGHRGSHT«NGH"
CDS 5571..5861
/gene="vpR"
/codon_start=1
/translation="REQAPED8GPQREPYNEHTLELLEELKNEAVRHFPRiyLHSLGQ
Y1YETYGDTV3AGVEAIIRIL6QLLFIHFRIGCRHSRIGITR9RRARNGASRS"
CDS 6073..6318
/gene="vpU"
/codon_start=1
/translation="H8PLQILAIVALVVAGIIAIIVySIVLIEYRKILR8RKIDRLID
KIRERAEDSGNESEGDSEELSALVERGHLAPyDINDL"
CDS 6236..8782
/gene="env"
/product="envelope polyprotein"
/codon_start=1
/translation="HRVKGIRKNYQHLyKGGILLLGTLHICSAVEKLWVTVYYGVPVW
KETTTTLFCASDAKAYDTEVHNVyATHACVPTDPNPfiEVVLENVTEDFNHHKNNMVEQ 
RQEDVINLWDQSLKPCVKLTPLCVTLNCKDVNATNTTSSSEGMMERGEIKNCSFNITK 
S1RDKV8KEYALFYKLDVVPIDNKNHTKYRLISCNTSVITQACPKVSFEPIPIHYCAP 
AGFAILKCNNKTFNGKG8CKNVSTVQCTHGIRPVVST9LLLNGSLAEEKVVIRSDNFT 
DNAKTI IVQLNESVKINCTRPSNNTRKSIHIGPGRAFYTTGEI IGDIRGAHCNISRA9 
WNNTLKQIVEKLREQFNNKTIVFTHSSGGDPEIVMHSFNCGGEFFYCNSTQLFNSTHN 
DTEKSSGTEGNDTI ILPCRIK8I INHW8EVGKAMYAPPIKG8IRCSSNITGLLLTRDG 
GKNESEIEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVV8REKRAVGIG 
ALFLGFLGAAGSTHGARSHTLTV8AR8LLSGIV888NNLLRAIEA88HHL8LTVHGIK 
8L8ARVLAVERYLKD88LHGIWGCSGKLICTTAVPyNTSHSNKSLDSIWNN«TWMEWE 
KEIENYTNTIYTLIEESQIQQEKNEQELLELDKWASLUNWFGITKHLWYIKIFIMIVG 
GLIGLRIVFSVLSIVNRVR8GYSPLSF8TLLPATRGPDRPEGIEEEGGERDRDRSG8L 
VNGFLALiyVDLRSLFLFSYHRLRDLLLTVTRIVELLGRRGHEILKYH«NLL8YWS8E 
LKNSAVSLLMATAIAVAEGTDRIIEVVQRVYRAILHIPTRIRQGLERALL"
CDS 8784..9434
/gene="nef"
/codon_start=1
/translation="HGGKySKHSVPGySTVRERHRRAEPATDRVR8TEPAAVGVGAVS
RDLEKHGAITSSNTAATNADCAWLEAYEDEEVGFPVRP8VPLRPHTYKAAIDLSHFLK 
EKGGLEGLIYSQKRQDILDLyiYHTQGYFPDyQNYTAGPGVRFPLTFGMCFKLVPVDP 
EKVEEANEGENNCLLHPHS8HGHDDPEKEVLVWKFDSKLALHHVARELHPEYYKDC"

BASE COUNT 3425 a 1691 c 2308 g 2116 t

ORIGIN 5' terminus of 5'LTR.

Initial Score = 652 Optimized Score = 652 Significance = 51.03 
Residue Identity = 94% Matches = 652 Mismatches = 38

Gaps = 0 Conservative Substitutions = 0 

X 10 20 30 40 50 60 70 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

minimum mi n mmiiiimmmiiimimimimm 

CTGGAAGGGCTAATTTACTCACAGAAAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 
X 10 20 30 40 50 60 

80 90 100 110 120 130 140 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 

f 1 1 1 1 1 1 r 1 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 mi imiiimm mimiimiiiiiiim 

GGCTACTTCCCTGATTGGCAGAACTACACAGCAGGACCAGGGGTCAGATTTCCACTGACCTTTGGATGGTGC 
70 80 90 100 110 120 130 

150 160 170 180 190 200 210 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 

i iiiiiiiiiiiiiimi inn mmimiiiimi iimimm i imimm 

TTCAAGCTAGTACCAGTTGATCCAGAGAAGGTAGAAGAGGCCAATGAAGGAGAGAACAACTGCTTGTTACAC 
140 150 160 170 180 190 200 210 

220 230 240 250 260 270 280 

CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 

m mm 1 1 1 1 1 1 1 1 1 1 1 1 inn mi iimmii iiiii miniiii mm 

CCTATGAGCCAGCATGGAATGGACGACCCAGAGAAGGAAGTGTTAGTGTGGAAGTTTGACAGCAAGCTAGCA 
220 230 240 250 260 270 280 



290 300 310 320 330 340 350 360 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 

ii iiiiimiiiiiiiiiiiiiiimmimi mi miiiiiii mini iiiinim 

TTGCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTACAAGGACTGCTGACACCGAGCTTTCTACAAGGGA 
290 300 310 320 330 340 350

370 380 390 400 410 420 430 

CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC 

1 1 1 1 1 1 1 1 1 1 1 1 iiiiiiiiiiiiiiiiimmiim miiiimimiiimimmm 

CTTTCCGCTGGGGACTTTCCAGGGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATGCTGC 
360 370 380 390 400 410 420 

440 450 460 470 480 490 500 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG 

miimimmiiiiiiimiiiimiiiiiimmiiiiiii miiiiiiiimiiim 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG 
430 440 450 460 470 480 490 

510 520 530 540 550 560 570 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

III lllllllllilllllllimiimilllllllllllllllllllllllllllllllllllllllll 
CTAGCTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

500 510 520 530 540 550 560 570 

580 590 600 610 620 630 640 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 

lllllllllllllllllllllllllllllllllllllllllllimillimilllllllllllllllll 
GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 

580 590 600 610 620 630 640 

650 660 670 680 690 X 

CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 

minium miiiimi mimmiii iimm 

CCGAACAGGGACCGGAAAGCGAAAGAGAAACCAGAGGAGATCTCTCGACGCAGGACTCGGCTTGCTGAAGCG 
650 660 670 680 690 700 710 

CGCACAGCAAGAGGCGAGGGGCGGCG 
720 730 740 
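
The summary figures printed for each hit appear to be related by simple arithmetic: dividing Matches by the number of aligned positions (Matches + Mismatches + Gaps) and truncating to a whole percent reproduces the printed Residue Identity for the HIVJRCSF hit (652 matches, 38 mismatches, 0 gaps, printed as 94%). The Python sketch below illustrates that check. The function name `residue_identity` is hypothetical, and the truncation rule is an assumption inferred from the printed numbers, not a documented formula of the search program.

```python
def residue_identity(matches: int, mismatches: int, gaps: int = 0) -> int:
    """Percent identity over aligned positions, truncated to a whole percent.

    Assumption: aligned positions = matches + mismatches + gaps, and the
    printout truncates rather than rounds.
    """
    aligned = matches + mismatches + gaps
    if aligned <= 0:
        raise ValueError("alignment has no aligned positions")
    return (100 * matches) // aligned


if __name__ == "__main__":
    # Figures taken from the HIVJRCSF score block.
    print(residue_identity(652, 38))
```

The same call can be repeated with the Matches and Mismatches values of any other hit to see whether its printed percentage is consistent.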



6. RAILEY-000-716.SEQ (1-696)

HIVNY5CG Human immunodeficiency virus type 1, isolate NY5



LOCUS       HIVNY5CG 9022 bp ss-RNA VRL 28-SEP-1992
DEFINITION  Human immunodeficiency virus type 1, isolate NY5; complete genome.
            Infectious single LTR molecular proviral genome.
ACCESSION   M38431
KEYWORDS    .
SOURCE      HIV-1, isolate NY5, unintegrated circular viral DNA. Infectious.
  ORGANISM  Human immunodeficiency virus type 1
            Viridae; ss-RNA enveloped viruses; Positive strand RNA virus;
            Retroviridae; Lentivirinae.
REFERENCE   1 (bases 1 to 9022)
  AUTHORS   Theodore,T. and Buckler-White,A.
  JOURNAL   Unpublished (1988)
  STANDARD  full automatic
COMMENT     Computer-readable copy of sequence kindly provided by Chuck
            Buckler, 01-NOV-1988. A partial sequence for NY5, isolated in 1984,
            is on page I-A-101 of this compendium and, as the 5' half of the
            hybrid HIVNL43, also an infectious clone, on page I-A-64.
            Hirt supernatant DNA extracted from A3.01 cells infected with the
            NY5 HIV isolate stock was digested with EcoRI and cloned into
            lambda WESB. The insert is an EcoRI permuted single LTR clone and
            was then transferred into pBR322. In the sequence below position
            one is the first base of the single LTR of the clone while the last
            base (9022) is the one just before the LTR of the intact circle.



FEATURES Location/Qualifiers 
LTR 1..634
/partial
protein_bind 377..386
/bound_moiety="Sp1"
protein_bind 388..397
/bound_moiety="Sp1"
protein_bind 399..408
/bound_moiety="Sp1"
protein_bind 636..653
/bound_moiety="Lys-tRNA"
exon 5830..6044
/number=2
/gene="tat"
exon 5969..6044
/number=2
/gene="rev"
exon 8316..8406
/number=3
/gene="tat"
exon 8316..8590
/number=3
/gene="rev"
CDS <2085..5096
/gene="pol"
/codon_start=1
/translation="FFREDLAFPQGKAREFSSE9TRANSPTRRELQVUGRDNNSLSEA
GADRSGTVSFSFP0ITLWQRPLVTIKIGG9LKEALLDTGADDTVLEEMNLPGRWKPKM 
IGGIGGFIKVR9YD9ILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPI 
ETVPVKLKPGHDGPKVKQyPLTEEKIKALVEICTEHEKEGKISKIGPENPYNTPVFAI 
KKKDSTKHRKLVDFRELNKRT9DFHEV9LGIPHPAGLK9KKSVTVLDVGDAYFSVPLD 
KDFRKYTAFTIPSIHNETPGIRY9YNVLP9GWKGSPAIF9CSHTKILEPFRK9NPDIV 
IYQYHDDLYVGSDLEIG9HRTKIEELRQHLLRNGFTTPDKKH9KEPPFLHHGYELHPD 
KWTV9PIVLPEKDSHTVNDI9KLVGKLN«AS9IYAGIKVRQLCKLLRGIKALTEVVPL 
TEEAELELAENREILKEPVHGVYYDPSKDLIAEI9K9G9G9WTY9IY9EPFKNLKTGK 
YARNKGAHTNDVKSLTEAV9KIATESIVIWGKTPKFKLPIQKETWEAWWTEYWQATWI 
PEyEFVNTPPLVKLyYQLEKEPIIGAETFYVDGAANRETKLGKAGYVTDRGRflKVVPL 
TDTTN8KTEL9AIHLALQDSGLEVNIVTDS9YALGII9A9P0KSESELVS9IIE6LIK 
KEKVYLAWVPAHKGIGGNE9VDKLVSAGIRKVLFLDGIDKA9EEHEKYHSNWRAHASD 
FNLPPVVAKEIVASCDKC9LKGEAdHG9VDCSPGIW9LDCTHLEGKVILVAVHVASGY 
IEAEVIPAETGQETAYFLLKLAGRWPVKTVHTDNGSNFTSTTVKAACWWAGIK9EFGI 
PYNP9SQGVIESHNKELKKIIG9VRD9AEHLKTAV8MAVFIHNFKRKGGIGGYSAGER 
IVDIIATDI9IKEL9K9ITKI9NFRVYYRDSRDPV«KGPAKLL«KGEGAVVI9DNSDI 
KVVPRRKAKIIRDYGKSHAGDDCVASRQDED"
CDS 790..2292
/gene="gag"
/codon_start=1
/translation="MGARASVLSGGELDK«EKIRLRPGGKK9YRLKHIVHASRELERF
AVNPGLLETSEGCR9ILRQLQPSL9TGSEERRSLFNTVAVLYCVH9RIDVKDTKEALD 
KIEEE9NKSKKKA9QAAADTGMSS9VS0NYPIVQNL9G9!1VH9AISPRTLNAHVKVVE 
EKAFSPEVIPHFSALSEGATP9DLNTMLNTVGGH9AAH9HLKETINEEAAEHDRLHPV 
HAGPIAPG9MREPRGSDIAGTTSTL9E9IGWRTHNPPIPVGEIYKRWIILGLNKIVRM 
YSPTS1LDIR9GPKEPFRDYVDRFYKTLRAE9AS9EVKNWHTETLLV9NANPDCKTIL 
KALGPAATLEENHTACflGVGGPCHKARVLAEAHSaVTNPATIHIQRCNFRNaKKTVKC 
FNCGKEGHIAKNCRAPRKKGC9KCGKEGH6MKDCTER9ANFLGKIWPSHKGRPGNFL9 
SRPEPTAPPEESFRFGEETTTPS9K8EPIDKELYPLASLRSLFGSDPSS9"
CDS 5041..5619
/gene="vif"
/codon_start=1
/translation="HENRUaVHIVllBVDRHRINTHKRLVKHHHYISRKAKDUFYRHHY
ESTNPKISSEVHIPLGDAKLVITTYWGLHTGERDHHLGQGVSIEWRKKRYST8VDPDL 
AD6LIHLHYFDCFSESAIRNTILGRIVSPRCEY9AGHNKVGSL8YLALAALIKPKQIK 
PPLPSVRKLTEDR«NKP9KTKGHRGSHTMNGH"
CDS 5559..5849
/gene="vpR"
/codon_start=1
/translation="HEQAPEDQGPQREPYNEWTLELLEELKSEAVRHFPRIHLHNLGS
HIYETYGDTWAGVEAIIRILQQLLFIHFRIGCQHSRIGIIR9RRARNGASRS"
CDS 6061..6147
/gene="vpU"
/codon_start=1
/translation="NLSFNSCVDHSVHRI8ESIKTKKSR8DN"
CDS 6180..8732
/gene="env"
/product="envelope polyprotein"
/codon_start=1
/translation="MRAKGTRKNYQHLWRWGTt1LLGMLMICSAAEQLWVTVYYGVPVW
KEATTTLFCASDAKAYDTEVHNVHATHACVPTDPNPQEVVL8NVTENFNWHKNNHVEQ 
MHEDI ISLWDQSLKPCVKLTPLCVTLNCTDLTNATYANGSSEERGEIRNCSFNVTTI I 
RNKIQKEYALFYRLDIVPIDKDNTSYTLINCDTSVITQACPKVSFEPIPIHYCAPAGF 
AILKCNDKKFNGTGPCTNVSTVQCTHGIKPVVST6LLLNGSLAEGEVVIRSENFTNNV 
KTI IVQLNESVEINCTRPNNNTRKGIAIGPGRTLYAREKI IGDIRSAHCNLSRAKWND 
TLK8IVTKLKEQFRNKTIVFNQSSGGDPEIVMHSFNCGGEFFYCKTTQLFNSTHLFNS 
TWNDTERSDNNETI ILPCRIKQI INRW9EVGKAMYAPPISGGIRCSSNI7GLLLTRDG 
GDKENSTTEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVV8REKRAVGA 
LGALFLGFLGAAGSTRGAASMALTV8TR8LMSGIVQ8QNNLLKAIEA88HLL8LTVWG 
IK8LQARVLAVERYLKD8GLLRIWGCSGKLICTTTVPHNASWSNKSLDKIHDNMTHHE 
WEREIDNYTGLIYTLIEES8I88EKNE8ELLELDKWASLHNWFDITKHLWYIKIFIMI 
VGGLIGLRIVFTVLSIVNRVR8GYSPLSF8TRLPA8RGPDRPEGIEEEGGERDRDRSG 
PLVNGFLALIWVDLRSLCLFSYHRLRDLLLIVARIVELLGRRGWEALKYCHNLL8YWG 
QELKNSAISLLNATAVAVAEGTDRVIEVA8RICRGILHIPRRIR8GLERLLL"
CDS 8734..9022
/gene="nef"
/partial
/codon_start=1
/translation="MGGKWSKRSHSGWPAVRERMKRTEPAADGVGAVSRDLEKHGAIT
SSNTAATNANCAWLEAHEEEEVGFPVRPQVPLRPITRKAAMDLSHFLKEEGGX"

BASE COUNT 3230 a 1591 c 2188 g 2013 t

ORIGIN 5' terminus of 5'LTR (start of U3)

Initial Score = 650 Optimized Score = 650 Significance = 50.86 
Residue Identity = 94% Matches = 650 Mismatches = 39

Gaps = 0 Conservative Substitutions = 0 

10 20 30 40 50 60 70 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

imiiillllil! Illlll lllllllllllllllllllllllllllllllflllllllll 
TGGAAGGGCTAATTTGGTCCCAAAGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 
X 10 20 30 40 50 60 

80 90 100 110 120 130 140 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 

lllllllllflllllllllllllillllllilllllllllll lllllllllilllllllllllllllllll 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGC 
70 80 90 100 110 120 130 

150 160 170 180 190 200 210 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 

i mi iiiiiiiiiiiiiiiiii iiiiiiiiiiiiiiiii minium immmm 

TTCAAGTTAGTACCAGTTGAGCCAGGGCAGGTAGAAGAGGCCAATGAAGGAGAGAACAACAGCTTGTTACAC 
140 150 160 170 180 190 200 

220 230 240 250 260 270 280 

CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 

iii iiiiii mm urn mn m imm mi iiiii iimiimi mini 

CCTATGAGCCAGCATGGGATGGAGGACCCGGAGGGAGAAGTATTAGTGTGGAAGTTTGACAGCCTCCTAGCA 
210 220 230 240 250 260 270 280 

290 300 310 320 330 340 350 360 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 

Mil llll 1 1 1 1 1 1 1 1 1 ! I ! 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 III 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 
TTTCGTCACATGGCCCGAGAGCTGCATCCGGAGTACTACAAAnAflTCCTnACATrnAfirTTTCTACAAnfiCA 



290 300 310 320 330 340 350 



370 380 390 400 410 420 430 

CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC 

MINIMUM MMMMMMM i 1 1 1 1 1 1 1 ! 1 1 i I MM i M I M M M I M M M M M M M M 

CTTTCCGCTGGGGACTTTCCAGGGAGGTGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATGCTGC 
360 370 380 390 400 410 420 

440 450 460 470 480 490 500 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG 

MMMMMMMMMMMMMMMMMMMMMMMMM! MMIMMMMMMMI 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG 
430 440 450 460 470 480 490 

510 520 530 540 550 560 570 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

111 IIIIIIIIIIIIIIIIINIIIIIIIIIIIIIIIIIIIIIIIIIIII llllllllllllllllllll 
CTAGCTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTACAAGTAGTGTGTGCCCGTCT 
500 510 520 530 540 550 560 

580 590 600 610 620 630 640 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 

IIIIIIMIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIillllllllllllllllllllllllllll 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 
570 580 590 600 610 620 630 640 

650 660 670 680 690 X 

CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 

MMMMMMMM llllllll II lllilllll llllllll 

CCGAACAGGGACTTGAGAGCGAAAGTAAAGCCAGAGGAGATCTCTCGACGCAGGACTCGGCTTGCTGAAGCG 
650 660 670 680 690 700 710 

CGCACGGCAAGAGGCGAGGGGCGGCG 
720 730 
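
A quick consistency check on a database entry such as HIVNY5CG is to add up the four figures on its BASE COUNT line and compare the total with the sequence length given for the entry (9022 bp). The Python sketch below shows one way to do that. The helper name `base_count_total` and the regular expression are illustrative assumptions and are not part of the search program or the database software.

```python
import re


def base_count_total(line: str) -> int:
    """Sum the a/c/g/t figures found on a BASE COUNT line."""
    # Each count is a run of digits followed by whitespace and one base letter.
    counts = re.findall(r"(\d+)\s+([acgt])\b", line.lower())
    if len(counts) != 4:
        raise ValueError("expected four base counts, found %d" % len(counts))
    return sum(int(number) for number, _base in counts)


if __name__ == "__main__":
    # Line copied from the HIVNY5CG entry; the expected length is 9022 bp.
    total = base_count_total("BASE COUNT 3230 a 1591 c 2188 g 2013 t")
    print(total, total == 9022)
```

A mismatch between the total and the stated length would suggest that one of the figures was misread during scanning.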



7. RAILEY-000-716.SEQ (1-696)

HIVNL43 Human immunodeficiency virus type 1, NY5/BRU (LAV-



LOCUS       HIVNL43 9709 bp ss-RNA VRL 15-JUN-1989
DEFINITION  Human immunodeficiency virus type 1, NY5/BRU (LAV-1) recombinant
            clone pNL4-3.
ACCESSION   M19921
KEYWORDS    .
SOURCE      Human immunodeficiency virus type 1 (HIV-1), NY5/BRU (LAV-1)
            recombinant clone pNL4-3.
  ORGANISM  Human immunodeficiency virus type 1
            Viridae; ss-RNA enveloped viruses; Positive strand RNA virus;
            Retroviridae; Lentivirinae.
REFERENCE   1 (bases 1 to 9709)
  AUTHORS   Adachi,A., Gendelman,H.E., Koenig,S., Folks,T., Willey,R.,
            Rabson,A. and Martin,M.A.
  TITLE     Production of acquired immunodeficiency syndrome-associated
            retrovirus in human and nonhuman cells transfected with an
            infectious molecular clone
  JOURNAL   J. Virol. 59, 284-291 (1986)
  STANDARD  full automatic
REFERENCE   2 (bases 1 to 9709)
  AUTHORS   Buckler,C.E., Buckler-White,A.J., Willey,R.L. and McCoy,J.
  JOURNAL   Unpublished (1988)
  STANDARD  full automatic
REFERENCE   3 (sites)
  AUTHORS   Buckler,C.E.
  JOURNAL   Unpublished (1988)
  STANDARD  full automatic
REFERENCE   4 (sites)
  AUTHORS   Dai,L.C., Littaua,R., Takahashi,K. and Ennis,F.A.
  TITLE     Mutation of human immunodeficiency virus type 1 at amino acid 585
            on gp41 results in loss of killing by CD8+ A24-restricted
            cytotoxic T lymphocytes
  JOURNAL   J. Virol. 66, 3151-3154 (1992)
  STANDARD  full automatic
COMMENT     [3] sites; revisions of [3].

            Clean copy of sequence [3] kindly provided by Chuck Buckler, NIAID,
            Bethesda, MD, 24-JUN-1988. The construction of pNL4-3 has been
            described in [1]. pNL4-3 is a recombinant (infectious) proviral
            clone that contains DNA from HIV isolates NY5 (5' half) and BRU (3'
            half). The site of recombination is the EcoRI site at positions
            5743-5748.

            The length and sequence of the vpr coding region corresponds to
            that of the BRU, SC, SF2, MAL and ELI isolates. The vpr coding
            region of these isolates is about 18 amino acid residues longer
            than the vpr coding region of the IIIb isolates. In HIVNL43, this
            shift is due to a single base deletion (with respect to the IIIb's)
            at position 5770. The sequence at this position is 'atttc' in
            HIVNL43 and 'attttc' in HIVHXB2.

            The original BRU clone, sequenced by Wain-Hobson, et al. (Cell 40,
            9-17 (1985)), and the BRU portion of the pNL4-3 recombinant clone
            are different clones from the same BRU isolate.

            Two of the revisions reported in the FEATURES produced changes in
            amino acid sequences. The revision at position 2421 changes one
            amino acid residue from 'R' to 'G' in the pol coding region. The
            revision at positions 8995-9000 changes three amino acid residues
            from 'AHT' to 'VTP' in the nef coding region.

FEATURES Location/Qualifiers
LTR 1..634
/note="5' LTR"
repeat_region 454..550
/note="R repeat 5' copy"
prim_transcript 455..9626
/note="tat, rev, nef subgenomic mRNA"
intron 744..5776
/note="tat, rev, nef mRNA intron 1"
misc_feature 5743..5748
/note="EcoRI site of recombination"
misc_recomb 5743..5744
/note="HIV-1 isolate NY5 DNA end/HIV-1 isolate LAV DNA
start"
intron 6045..8368
/note="tat cds intron 2"
intron 6045..8368
/note="rev cds intron 2"
intron 6045..8368
/note="tat, rev, nef mRNA intron 2"
LTR 9076..9709
/note="3' LTR"
repeat_region 9529..9626
/note="R repeat 3' copy"
polyA_signal 9602..9607
/note="mRNA polyadenylation signal"
CDS join(5830..6044,8369..8414)
/note="tat protein"
/codon_start=1
/translation="MEPVDPRLEPWKHPGS9PKTACTNCYCKKCCFHCQVCFMTKALG
ISYGRKKRR8RRRAHQNSQTH8ASLSKGPTSQSRGDPTGPKE"
CDS join(5969..6044,8369..8643)
/note="rev protein"
/codon_start=1
/translation="HAGRSGDSDEELIRTVRLIKLLY8SNPPPNPEGTR8ARRNRRRR
HRER8R8IHSISERILSTYLGRSAEPVPL8LPPLERLTLDCNEDCGTSGT8GVGSP8I
LVESPTVLESGTKE"
CDS 790..2292
/note="gag polyprotein"
/codon_start=1

/translation="HGARASVLSGGELDKWEKIRLRPGGKK8YKLKHIVUASRELERF 
AVNPGLLETSEGCR8ILG8L8PSL8TGSEELRSLYNTIAVLYCVH8RIDVKDTKEALD 
KIEEE8NKSKKKAQ8AAADTGNNS8VS8NYPIV8NL8G8HVH8AISPRTLNAWVKVVE 
EKAFSPEVIPHFSALSEGATP8DLNTHLNTVGGH8AAH8HLKETINEEAAEWDRLHPV 
HAGPIAPG8HREPRGSDIAGTTSTL8EQIGUHTHNPPIPVGEIYKRWI1LGLNKIVRM 
YSPTSILDIR6GPKEPFRDYVDRFYKTLRAE8AS8EVKNMMTETLLV8NANPDCKTII 
KALGPGATLEEHHTAC8GVGGPGHKARVLAEANS8VTNPATIHI8KGNFRN8RKTVKC 
FNCGKEGHIAKNCRAPRKKGCWKCGKEGH8HKDCTER8ANFLGKIWPSHKGRPGNFL8 
SRPEPTAPPEESFRFGEETTTPS8K8EPIDKELYPLASLRSLFGSDPSS8" 
CDS 2085..5096
/partial
/note="pol polyprotein; (NH2-terminus uncertain)"
/codon_start=1
/translation="FFREDLAFP8GKAREFSSE8TRANSPTRREL8VWGRDNNSLSEA
GADR8GTVSFSFP8 1 TLH8RPL VT I K I GG8LKEALLDTGADDTVLEEI1NLPGRHKPKH 
IGGIGGFIKVG8YD8ILIEICGHKAIGTVLVGPTPVNIIGRNLLT8IGCTLNFPISPI 
ETVPVKLKPGHDGPKVK8HPLTEEKIKALVEICTEMEKEGKISKIGPENPYNTPVFAI 
KKKDSTKHRKLVDFRELNKRT8DFHEV8LGIPHPAGLK8KKSVTVLDVGDAYFSVPLD 
KDFRKYTAFTIPSINNETPGIRY8YNVLP8GHKGSPAIF8CSHTKILEPFRK8NPDIV 
IY8YMDDLYVGSDLEIG8HRTKIEELR8HLLRWGFTTPDKKH8KEPPFLHHGYELHPD 
KUTV8PIVLPEKDSUTVNDI8KLVGKLNMAS8IYAGIKVR8LCKLLRGTKALTEVVPL 
TEEAELELAENREILKEPVHGVYYDPSKDLIAEI8K8G8G8WTY8IY8EPFKNLKTGK 
YARHKGAHTNDVK8LTEAV8KIATESIVIHGKTPKFKLPI8KETHEAWWTEYH8ATHI 
PEWEFVNTPPLVKLHY8LEKEPIIGAETFYVDGAANRETKLGKAGYVTDRGR8KVVPL 
TDTTN8KTEL8AIHLAL8DSGLEVNIVTDS8YALGII8A8PDKSESELVS8IIE8LIK 
KEKVYLAWVPAHKGIGGNE8VDGLVSAGIRKVLFLDGIDKA8EEHEKYHSNWRAHASD 
FNLPPVVAKEIVASCDKC8LKGEAHHG8VDCSPGIW8LDCTHLEGKVILVAVHVASGY 
IEAEVIPAETG8ETAYFLLKLAGRWPVKTVHTDNGSNFTSTTVKAACHWAGIK8EFGI 
PYNP8S8GVIESNNKELKKIIG8VRD8AEHLKTAV8MAVFIHNFKRKGGIGGYSAGER 
IVDIIATDI 8TKEL8K8 1 TK 1 8NFRVY YRDSRDPVHKGPAKLLHKGEGA VV I8DNSDI 
KVVPRRKAKIIRDYGK8NAGDDCVASR8DED" 
CDS 5041..5619
/note="vif protein"
/codon_start=1
/translation="HENRH8VMIVU8VDRHRINTHKRLVKHHHYISRKAKDUFYRHHY
ESTNPKISSEVHIPLGDAKLVITTYWGLHTGERDHHLG8GVSIEWRKKRYST8VDPDL
AD8LIHLHYFDCFSESAIRNTILGRIVSPRCEY8AGHNKVGSL8YLALAALIKPK8IK
PPLPSVRKLTEDRHNKP8KTKGHRGSHTMNGH"
CDS 5559..5849
/note="vpr protein"
/codon_start=1
/translation="HE8APED8GP8REPYNEHTLELLEELKSEAVRHFPRIHLHNLG8
HIYETYGDTHAGVEAIIRIL8GLLFIHFRIGCRHSRIGVTRGRRARNGASRS"
CDS 6061..6306
/note="vpu protein"
/codon_start=1
/translation="M8PIIVAIVALVVAIIIAlVVHSIVIIEYRKILR8RKIDRLIDR
LIERAEDSGNESEGEVSALVEMGVEHGHHAPHDIDDL"
CDS 6221..8785
/note="envelope polyprotein"
/codon_start=1
/translation="MRVKEKY8HLyRWGHKUGT«LLGILHICSATEKLHVTVYYGVPV
WKEATTTLFCASDAKAYDTEVHNVHATHACVPTDPNP8EVVLVNVTENFNMHKNDHVE 
QMHED1 1SLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFN 
ISTSIRDKV8KEYAFFYKLDIVPIDNTSYRLISCNTSVIT8ACPKVSFEPIPIHYCAP 
AGFAILKCNNKTFNGTGPCTNVSTV8CTHGIRPVVST8LLLNGSLAEEDVVIRSANFT 
DNAKTIIV8LNTSVEINCTRPNNNTRKSIRI8RGPGRAFVTIGKIGNNR8AHCNISRA 
KHNATLKQIASKLRE8FGNNKTIIFK8SSGGDPEIVTHSFNCGGEFFYCNST8LFNST 
HFNSTySTEGSNNTEGSDT I TLPCRI K8F I NHH8E VGKAHY APPI SG8 1 RCSSN I TGL 

LLTRnnCNNNNnRFIFRPr.RnnHRnUUR<;FI YKYKVUKTFPI fiVAPTKAKRRVVflRFK 



RAVGIGALFLGFLGAAGSTHGCTSMTLTVSARQLLSDIVQ9QNNLLRAIEASQHLL9L 
TVWGIKQL9ARILAVERYLKDQQLLGI WGCSGKLICTTAVPWNASWSNKSLEQIWNNfl 

TMHEUDREINNYTSLIHSLIEES9NG9EKNEQELLELDKHASLWNWFNITNWLWYIKL 
FIHIVGGLVGLRIVFAVLSIVNRVRQGYSPLSF8THLPIPRGPDRPEGIEEEGGERDR 
DRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGHEALKYHUNLL 
QYHS8ELKNSAVNLLNATAIAVAEGTDRVIEVLQAAYRAIRHIPRRIRQGLERILL"
CDS 8787..9407
/note="nef protein"
/codon_start=1
/translation="MGGKHSKSSVIGHPAVRER«RRAEPAADGVGAVSRDLEKHGAIT
SSNTAANNAACAWLEAQEEEEVGFPVTP9VPLRPMTYKAAVDLSHFLKEKGGLEGLIH 
SQRRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGMCYKLVPVEPDKVEEANKGE 
NTSLLHPVSLHGMDDPEREVLEURFDSRLAFHHVARELHPEYFKNC"

BASE COUNT 3421 a 1756 c 2366 g 2166 t 

ORIGIN 5' terminus of NY5 LTR 

Initial Score = 645 Optimized Score = 645 Significance = 50.45 
Residue Identity = 93% Matches = 645 Mismatches = 44

Gaps = 0 Conservative Substitutions = 0 

10 20 30 40 50 60 70 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

iiiiiiiiiiiiii nun iiiiinii mmmmiiimiiimmm 

TGGAAGGGCTAATTTGGTCCCAAAAAAGACAAGAGATCCTTGATCTGTGGATCTACCACACACAA 
X 10 20 30 40 50 60 

80 90 100 110 120 130 140 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 

lllllllllllllllllllillllllllllllllllllllll lllllllllllllllllllllllllllll 
GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGC 

70 80 90 100 110 120 130 

150 160 170 180 190 200 210 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 

i mi iiiuiiiiiiii inn i Minimum iiiimm i immiimi 

TTCAAGTTAGTACCAGTTGAACCAGAGCAAGTAGAAGAGGCCAAATAAGGAGAGAAGAACAGCTTGTTACAC 
140 150 160 170 180 190 200 

220 230 240 250 260 270 280 

CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 

iii linn mm mn urn m imm mi mil Minium mini 

CCTATGAGCCAGCATGGGATGGAGGACCCGGAGGGAGAAGTATTAGTGTGGAAGTTTGACAGCCTCCTAGCA 
210 220 230 240 250 260 270 280 

290 300 310 320 330 340 350 360 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 

mi mi mimiiiiimiimmmi m mmmmimii iiiimm 

TTTCGTCACATGGCCCGAGAGCTGCATCCGGAGTACTACAAAGACTGCTGACATCGAGCTTTCTACAAGGGA 
290 300 310 320 330 340 350 

370 380 390 400 410 420 430 

CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC 

miiiiiiiii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 immmm mmiimimimiiiiimi i 

CTTTCCGCTGGGGACTTTCCAGGGAGGTGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATGCTAC 
360 370 380 390 400 410 420 

440 450 460 470 480 490 500 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG 

i 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 i 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1! I i 1 1 1 1 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG 
430 440 450 460 470 480 490 

510 520 530 540 550 560 570 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

1 1 1 1 1 1 1 1 1 f i i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 i 1 1 1 1 1 i 1 1 1 1 i 1 1 1 1 1 miimiiiiiiiim 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTG ' 



500 510 520 530 540 550 560 



580 590 600 610 620 630 640 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 

llllllllllllilliiMlllillMMIilllillllilllMliMliilllMlllllllllltllll 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 
570 580 590 600 610 620 630 640 

650 660 670 680 690 X 

CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 

miiiiimiiiimiiiim n mmm mum 

CCGAACAGGGACTTGAAAGCGAAAGTAAAGCCAGAGGAGATCTCTCGACGCAGGACTCGGCTTGCTGAAGCG 
650 660 670 680 690 700 710 

CGCACGGCAAGAGGCGAGGGGCGGCG 
720 730 
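
The CDS locations listed for HIVNL43 can be checked against their translations by computing how many bases each location covers: a simple span such as 790..2292 covers end - start + 1 bases, and a join(...) location covers the sum of its segments. The Python sketch below is one way to do this. The function name `location_length` is hypothetical, the location strings are written with spacing and scan artifacts normalized, and the parser handles only the plain `a..b` and `join(a..b,c..d)` forms that occur in this printout.

```python
import re


def location_length(location: str) -> int:
    """Number of bases covered by a simple or join(...) feature location."""
    text = location.replace(" ", "")
    # A leading '<' or trailing '>' marks a partial feature and is ignored here.
    spans = re.findall(r"<?(\d+)\.\.>?(\d+)", text)
    if not spans:
        raise ValueError("no start..end span found in %r" % location)
    return sum(int(end) - int(start) + 1 for start, end in spans)


if __name__ == "__main__":
    gag = location_length("790..2292")
    rev = location_length("join(5969..6044,8369..8643)")
    # A complete coding region should be a whole number of codons.
    print(gag, gag % 3 == 0)
    print(rev, rev % 3 == 0)
```

Dividing the length by three and subtracting one for the stop codon gives the number of residues a complete translation would be expected to show, which can help spot a misread coordinate.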



8. RAILEY-000-716.SEQ (1-696)

AIHTLV31 Human t-cell leukemia virus type iii provirus, 5'

ID   AIHTLV31 standard; RNA; VRL; 660 BP.
XX
AC   K02003;
XX
DT   13-JUN-1985 (Rel. 06, Created)
DT   11-AUG-1990 (Rel. 25, Last updated, Version 1)
XX
DE   Human t-cell leukemia virus type iii provirus, 5' ltr from hxb2
XX
KW   acquired immune deficiency syndrome; long terminal repeat;
KW   provirus.
XX
OS   Human immunodeficiency virus type 1
OC   Viridae; ss-RNA enveloped viruses; Positive strand RNA viruses;
OC   Retroviridae; Lentivirinae.
XX
RN   [1]
RP   1-660
RA   Starcich B., Ratner L., Josephs S.F., Okamoto T., Gallo R.C.,
RA   Wong-staal F.;
RT   "Characterization of long terminal repeat sequences of HTLV-III";
RL   Science 227:538-540(1985).
XX
CC   Acquired immune deficiency syndrome (aids) is caused by a
CC   retrovirus known by four different names, probably representing
CC   four different strains: human t-cell leukemia virus-iii (htlv-iii),
CC   aids-associated retrovirus type 2 (arv-2), aids virus, and
CC   lymphadenopathy-associated virus (lav), it is still unclear with
CC   which type of virus it is most closely associated.
CC
CC   the ltr has u3, r, and u5 regions of 453, 98, and 83 bp,
CC   respectively, this sequence has some regions homologous to human
CC   t-cell growth factor (tcgf), and the u3 region shows 83% homology
CC   with intron 1 of human gamma-interferon (gamma-if) [1]; they
CC   conclude that the regions in the htlv-iii ltr which correspond to
CC   regions in tcgf and gamma-if could be important in host cell
CC   tropism of transcriptional regulation of this virus.
XX
FH   Key Location/Qualifiers
FH
XX
SQ   Sequence 660 BP; 160 A; 159 C; 187 G; 154 T; 0 other;

Initial Score = 644 Optimized Score = 645 Significance = 50.37 

Residue Identity = 97% Matches = 645 Mismatches = 15

Gaps = 0 Conservative Substitutions = 0



X 10 20 30 40 50 60 70 

GGGGGACTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 

1 llllllllllllllllllllllllilllllllllllllillllllllllllllllllllilllll 

TAGTAGTTGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAA 
X 10 20 30 40 50 60 70 

80 90 100 110 120 130 140 

GGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 

iiiimimiim iiiiiiiMiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii 

GGCTACTTCCCTGATTAGCAGAACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGC 
80 90 100 110 120 130 140 

150 160 170 180 190 200 210 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 

MMIilllliiillllilllllillllllllltlMIMillllllMlllitlMiMlillllltllll 

TACAAGCTAGTACCAGTTGAGCCAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACAC 
150 160 170 180 190 200 210 

220 230 240 250 260 270 280 

CCTGTGAGCCTGCATGGAATGGATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 

miiimiimm iiiniiiin iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii 

CCTGTGAGCCTGCATGGGATGGATGACCCGGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCA 
220 230 240 250 260 270 280 

290 300 310 320 330 340 350 360 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGA 

lllll!llllll!lli!!MI!!!!llliIlllllllll!i!lllllllli 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

TTTCATCACGTGGCCCGAGAGCTGCATCCGGAGTACTTCAAGAACTGCTGATATCGAGCTTGCTACAAGGGA 
290 300 310 320 330 340 350 360 

370 380 390 400 410 420 430 

CTTTCCGCTGGGCACTTTCCAGGGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGC 

lilflimil! 1 1 ! 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 llllllllllllllflllllllll! Illl 
CTTTCCGCTGGGGACTTTCCAGGGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATCCTGC 
370 380 390 400 410 420 430 

440 450 460 470 480 490 500 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGG 

IMIIIIIIIilMIIIIIIMIIlllllMlllllllllllllillllll imilllimilllllll 

ATATAAGCAGCTGCTTTTTGCCTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGG 
440 450 460 470 480 490 500 

510 520 530 540 550 560 570 

CTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 

III lllllllllllilllllllillllllllllllllllllllllilllllllllllllllllllllllll 

CTAGCTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT 
510 520 530 540 550 560 570 

580 590 600 610 620 630 640 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 

IIHIilMIMilillliMMilllllilMlllllliiMlillMMIiMlllitliillillilll 

GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGC 
580 590 600 610 620 630 640 

650 X 670 680 690 

CCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGA 

miiiiiiiii 

CCGAACAGGGAC 
650 660 



9. RAILEY-000-716.SEQ (1-696) 

REHIVXB2 Human T-lymphotropic virus type III (HTLV-III) 3'O



LOCUS       REHIVXB2 923 bp RNA VRL 01-JUN-1992
DEFINITION  Human T-lymphotropic virus type III (HTLV-III) 3'ORF HXB2 RNA
ACCESSION   X03187
KEYWORDS    acquired immune deficiency syndrome; long terminal repeat;
            provirus; unidentified reading frame.
SOURCE      Aids-associated retrovirus
  ORGANISM  Aids-associated retrovirus
            Viridae; ss-RNA enveloped viruses; Positive strand RNA viruses;
            Retroviridae.
REFERENCE   1 (bases 1 to 923)
  AUTHORS   Ratner,L., Starcich,B., Josephs,S.F., Hahn,B.H., Reddy,E.P.,
            Livak,K.J., Petteway,S.R. Jr., Pearson,M.L., Haseltine,W.A.,
            Arya,S.K. and Wong-staal,F.
  TITLE     Polymorphism of the 3' open reading frame of the virus associated
            with the acquired immune deficiency syndrome, human T-lymphotropic
            virus type III
  JOURNAL   Nucleic Acids Res. 13, 8219-8229 (1985)
  STANDARD  full automatic
COMMENT     *source: clone_library=lambda gtwes-lambda b; *source: clone=HXB2;

            Clone HXB2 with a termination codon at amino acid residue 124 gives
            rise to viral particles and cytopathic effects, and thus appears to
            be a fully functional clone. The N terminal portion of the 3' ORF
            protein product may include the functional region of the molecule.
            HXB2 represents an integrated proviral clone; author numbering
            refers to viral cap site at pos. +1. see x03287 - x03292.

FEATURES Location/Qualifiers
misc_feature 288..923
/note="3' LTR"
misc_feature 283..742
/note="U3 sequence"
misc_feature 743..840
/note="R sequence"
misc_feature 841..923
/note="U5 sequence"
CDS 1..369
/note="3' ORF; (aa 1-123)"
/codon_start=1
/translation="NGGKWSKSSVIGHPTVRERMRRAEPAADGVGAASRDLEKHGAIT
SSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIH
S8RR6DILDLUIYHTQGYFPD"

BASE COUNT 249 a 207 c 262 g 205 t

ORIGIN



Initial Score = 631 Optimized Score = 631 Significance = 49.31
Residue Identity = 98% Matches = 631 Mismatches = 10
Gaps = 0 Conservative Substitutions = 0



X 10 20 

GGGGGACTGGAAGGGCTAATTC 

||||||||||||||||||||||

CAATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTC 
240 250 260 270 280 X 290 300 

30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

|||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||

ACTCCCAAAGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTAGCAGA 
310 320 330 340 350 360 370 



100 110 120 130 140 150 160 

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 
380 390 400 410 420 430 440 



170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 

|||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||

CAGATAAGATAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGGATGG 
450 460 470 480 490 500 510 520 

240 250 260 270 280 290 300 310 

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 

||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ATGACCCGGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 
530 540 550 560 570 580 590 

320 330 340 350 360 370 380 

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGGACTTTCCAG 
600 610 620 630 640 650 660 

390 400 410 420 430 440 450 

GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 

||||||||||||||||||| ||||||||||||||||||||||||| ||||||||||||||||||||||||||

GGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATCCTGCATATAAGCAGCTGCTTTTTGCC 
670 680 690 700 710 720 730 

460 470 480 490 500 510 520 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT

||||||||||||||||||||||||||||| ||||||||||||||||||||||||||| ||||||||||||||

TGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAACTAAGGAACCCACTGCTT 
740 750 760 770 780 790 800 

530 540 550 560 570 580 590 

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAG 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAG 
810 820 830 840 850 860 870 880 

600 610 620 630 640 650 660 670 

AGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGCCCGAACAGGGACTTGAAAGCGA 

|||||||||||||||||||||||||||||||||||||||||||

AGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCA 
890 900 910 920 X 

680 690 
AAGGGAAACCAGAGGAGCTCT 



10. RAILEY-000-716.SEQ (1-696) 

REHIVXB3 Human T-lymphotropic virus type III (HTLV III) 3'



LOCUS       REHIVXB3     923 bp    RNA             VRL       01-JUN-1992
DEFINITION  Human T-lymphotropic virus type III (HTLV III) 3' ORF HXB3 RNA
ACCESSION   X03188
KEYWORDS    acquired immune deficiency syndrome; long terminal repeat;
            provirus; unidentified reading frame.
SOURCE      Aids-associated retrovirus
  ORGANISM  Aids-associated retrovirus
            Viridae; ss-RNA enveloped viruses; Positive strand RNA viruses;
            Retroviridae.
REFERENCE   1  (bases 1 to 923)
  AUTHORS   Ratner,L., Starcich,B., Josephs,S.F., Hahn,B.H., Reddy,E.P.,
            Livak,K.J., Petteway,S.R. Jr., Pearson,M.L., Haseltine,W.A.,
            Arya,S.K. and Wong-staal,F.
  TITLE     Polymorphism of the 3' open reading frame of the virus associated
            with the acquired immune deficiency syndrome, human T-lymphotropic
            virus type III
  JOURNAL   Nucleic Acids Res. 13, 8219-8229 (1985)
  STANDARD  full automatic
COMMENT     *source: clone_library=lambda gtwes-lambda b; *source: clone=HXB3;
            HXB3 represents an integrated proviral clone; see x03187 - x03190;
            author numbering refers to viral cap site at pos. +1.
FEATURES             Location/Qualifiers
     misc_feature    288..923
                     /note="3' LTR"
     misc_feature    288..742
                     /note="U3 sequence"
     misc_feature    743..840
                     /note="R sequence"
     misc_feature    841..923
                     /note="U5 sequence"
     CDS             1..618
                     /note="3' ORF; (aa 1-206)"
                     /codon_start=1
                     /translation="MGGKWSKSSVVGWPAVRERMRRAEPAADGVGAASRDLEKHGAIT
                     SSNTAANNAACAWLEAQEEEKVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIH
                     SQRRQDILDLWIYHTQGYFPDWQNYTPGPGIRYPLTFGWRYKLVPVEPEKLEEANKGE
                     NTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC"
BASE COUNT      252 a    208 c    260 g    203 t
ORIGIN



Initial Score = 626 Optimized Score = 626 Significance = 48.90
Residue Identity = 97% Matches = 626 Mismatches = 15
Gaps = 0 Conservative Substitutions = 0
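Note on the Residue Identity figures: the values printed in these score blocks are consistent with matches divided by the number of aligned positions, truncated to a whole percent (631 of 641 positions gives 98%, 626 of 641 gives 97%). The printout does not state the formula the search program uses, so the Python sketch below is an assumption, and `residue_identity_percent` is a hypothetical helper name.

```python
def residue_identity_percent(matches: int, mismatches: int, gaps: int = 0) -> int:
    """Percent identity over aligned positions, truncated to an integer.

    Assumption: aligned positions = matches + mismatches + gaps. The search
    program's own definition is not given in the printout.
    """
    aligned = matches + mismatches + gaps
    if aligned <= 0:
        raise ValueError("alignment must contain at least one position")
    # Integer floor division truncates rather than rounds.
    return matches * 100 // aligned


if __name__ == "__main__":
    # Figures taken from the two score blocks (REHIVXB2 and REHIVXB3).
    print(residue_identity_percent(631, 10))  # 98
    print(residue_identity_percent(626, 15))  # 97
```

Rounding to the nearest percent would give 98% for the second alignment (97.66%), which is why truncation is assumed here.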

X 10 20 

GGGGGACTGGAAGGGCTAATTC 

||||||||||||||||||||||

CAATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTC 
240 250 260 270 280 X 290 300 

30 40 50 60 70 80 90 

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

ACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTCCCTGATTGGCAGA 
310 320 330 340 350 360 370 

100 110 120 130 140 150 160 

ACTACACACCAGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGC 

||||||||||||| |||||| | |||||||||||||||||||||||| ||||||||||||||||||||||||

ACTACACACCAGGACCAGGGATAAGATATCCACTGACCTTTGGATGGCGCTACAAGCTAGTACCAGTTGAGC 
380 390 400 410 420 430 440 



170 180 190 200 210 220 230 

CAGATAAGGTAGAAGAGGCCAATAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 

|||| ||| ||||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||||
CAGAGAAGTTAGAAGAAGCCAACAAAGGAGAGAACACCAGCTTGTTACACCCTGTGAGCCTGCATGGAATGG 
450 460 470 480 490 500 510 520 



240 250 260 270 280 290 300 310 

ATGACCCTGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 

||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ATGACCCGGAGAGAGAAGTGTTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACGTGGCCCGAGAGC 
530 540 550 560 570 580 590 



320 330 340 350 360 370 380 

TGCATCCGGAGTACTTCAAGAACTGCTGACATCGAGCTTGCTACAAGGGACTTTCCGCTGGGCACTTTCCAG 

||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||| |||||||||

TGCATCCGGAGTACTTCAAGAACTGCTGATATCGAGCTTGCTACAAGGGACTTTCCGCTGGGGACTTTCCAG 
600 610 620 630 640 650 660 

390 400 410 420 430 440 450 



GGAGGCGTGGCCTGGGCGGAACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCC 

||||||||||||||||||| ||||||||||||||||||||||||| ||||||||||||||||||||||||||

GGAGGCGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATCCTGCATATAAGCAGCTGCTTTTTGCC 
670 680 690 700 710 720 730 



460 470 480 490 500 510 520 

TGTACTGGGTCTCTCTGGTTAGACCAGATTTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT 

||||||||||||||||||||||||||||| ||||||||||||||||||||||||||| ||||||||||||||
TGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAACTAAGGAACCCACTGCTT 

740 750 760 770 780 790 800 



530 540 550 560 570 580 590 

AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAG 

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAG 

810 820 830 840 850 860 870 880 



600 610 620 630 640 650 660 670 

AGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGCCCGAACAGGGACTTGAAAGCGA 

|||||||||||||||||||||||||||||||||||||||||||

AGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCA 
890 900 910 920 X 



680 690 
AAGGGAAACCAGAGGAGCTCT 
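
Note on the match lines: in each alignment block the line between the query and the database sequence marks identical residues, with a blank at each mismatch. A minimal Python sketch of how such a line can be derived for an ungapped alignment (Gaps = 0, as reported above) follows; `match_line` and `count_matches` are hypothetical helper names, not part of the search program.

```python
def match_line(query: str, subject: str) -> str:
    """Return '|' where aligned residues agree and ' ' where they differ.

    Assumes an ungapped alignment: position i of the query is compared with
    position i of the subject, over the shorter of the two strings.
    """
    n = min(len(query), len(subject))
    return "".join(
        "|" if query[i].upper() == subject[i].upper() else " "
        for i in range(n)
    )


def count_matches(query: str, subject: str) -> tuple[int, int]:
    """Return (matches, mismatches) for the compared positions."""
    line = match_line(query, subject)
    matches = line.count("|")
    return matches, len(line) - matches


if __name__ == "__main__":
    # First 12 residues of the block starting at query position 239.
    q = "ATGACCCTGAGA"
    s = "ATGACCCGGAGA"
    print(q)
    print(match_line(q, s))
    print(s)
    print(count_matches(q, s))  # (11, 1)
```

Summing the per-block counts obtained this way reproduces the Matches and Mismatches totals in the score blocks.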



